Wagon Trains to the Cloud

This is the first post in a series on the challenges organisations face when they move from hosting online content and tools firmly on their own estate to renting space in the cloud. We will help you consider the options and guide you through the steps you need to take.

In this first post we show you the most common challenges you are likely to face and how you may overcome them.

A fast migration path to becoming tenants in a cloud apartment building exposes a set of business-critical issues that have to be mitigated:

  • Wayfinding in a maze of content buckets and social habitats.
  • Emerging digital Ghost Towns due to lack of information governance.
  • Digital Landfills without organising principles for information and data.
  • Digital Litter with little or no governance or principles for ownership, with redundant, outdated and trivial (ROT) content.
  • Erosion of any chance of a positive business outcome from moving to the cloud, when there is no strategy or plan.

Image: "WagonTrn" by Tillman at en.wikipedia. Transferred from en.wikipedia by SreeBot. Licensed under Public domain via Wikimedia Commons.

The way forward is to settle on a sustainable information architecture that supports an information environment in constant flux, with information and data that are interoperable on any platform, everywhere, anytime and on any device.

You need to be able to show how everything is managed and how everyone fits together. A governance framework can help you do this. It can show who is responsible for the intranet, what their responsibilities are and how these fit with the strategy and plan. Making it available to everyone on the intranet helps them understand how the intranet is managed and how it supports the business.

The main point is to have a governance framework and an information architecture with the same scope, in order to avoid gaps where content is not managed or cannot be found.

Both need to be in harmony and included in any digital strategy. This avoids competing information architectures and governance frameworks being created by different people, which gives people inconsistent experiences: they do not find what they need and turn to alternative, less efficient ways of finding what they need for their work tasks.

Background

Building huts, houses and villages is an emerging social construction. As humans we coordinate our common resources, tools and practices. A habitat populated by people needs housekeeping rules, with resources available for cooking, cleaning, social life and so on, and routines that define who does what task and by when in order to keep everything in order.

A framework of governing principles that sets out roles and responsibilities, along with standards that set out the expected level of quality and quantity for each task, and that everyone is engaged with and complies with, is similar to how the best intranets and digital workplaces are managed.

In the early stages, with a small number of habitats, the rules for coordination are pretty simple, both for resources shared between the groups and for the pathways that connect them. The bigger a village gets, the more it takes from new structures to keep things running smoothly. When we move ahead into mega cities with 20+ million people living close together, it boils down to a general overarching plan and common infrastructure, but you also need local networked communities in order to find feasible solutions for living together.

Like villages and mega cities, a digital workplace needs consistency that helps everyone to work and live together. Whenever you go out you know that there are pavements to walk on, roads for driving, traffic lights that we stop at when they turn red and signs that show us the easiest way to our destination.

Sustainable architecture and governance create a consistent user experience. A well-structured information architecture that is aligned with a clear governance framework sets out roles and responsibilities. Publishing standards based on business needs support publishers in following them. This means that wherever content is published, whether it is accredited or collaborative, it will appear consistent to people and be located where they expect it to be. This encourages a natural way of moving through a digital environment, with recognisable headings and consistently placed search and other features.

This allegory fits like a glove when moving into large enterprise-wide shared spaces for collaboration, whether they are cloud-based, on-premises or a mix thereof. The social constructions and constraints remain the same. As IT services on tap, the cloud certainly limits how flexible and adjustable each habitat can be, so that it can host as many similar habitats as possible. But it offers a key solution that you can move into instantly! Tenants share the same apartment building (SharePoint Online).

When the set of habitats grows, navigating this maze becomes a hazard for most of us. Wayfinding in a digital mega city is extremely difficult. To a large extent, enterprises moving into collaboration suites suffer from the same problem, regardless of whether it is SharePoint, IBM Connections, Google Apps for Work or a similar setting. It is not a discussion of which type of house to choose, but rather of which architecture and plan will work in the emerging environment.

Information Architecture for Digital Habitats

If one leans upon linked-data, linked-open-data and the emerging semantic web and web of data standards, there is a set of very simple guidelines that one should adhere to when building a Digital Village or Mega City: the 5 stars of linked-open-data, our beacon of light!

All collections and shared spaces should have persistent URIs, which is the fourth star on the ladder. The third star, non-proprietary formats, obviously becomes a bit tricky, since e.g. MS SharePoint and MS Office like to encourage their own formats. But if one adds resource descriptions to collections and artifacts using Dublin Core elements, it will be possible to connect different types of matter. With feasible and standardised resource descriptions it will be possible to add schemas and structures that can tell us a little bit more about the artifacts or collections thereof; hence the option to adhere to the second star. The first star will, inside the corporate setting, become key to connecting different business units and areas, whether under open licenses, restricted to internal use only or, in some cases, open to other external parties.

Linking data-sets, that is collections or habitats, with different artifacts is the fifth star. This is where it all starts to make sense, enabling a connected digital workplace: building a city plan with pathways, traffic signals and rules, highways, roads, neighbourhoods, infrastructural services and more.
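To make the stars a little more tangible, here is a minimal sketch in Python of what a persistent URI, a Dublin Core resource description and a link between two resources could look like. The URIs, titles and the `Resource` class are hypothetical and only serve as illustration; a real implementation would more likely use RDF tooling and the organisation's own identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A collection or artifact identified by a persistent URI (fourth star)."""
    uri: str
    # Resource description using Dublin Core element names such as "dc:title".
    metadata: dict = field(default_factory=dict)
    # URIs of other resources this one links to (fifth star).
    links: list = field(default_factory=list)


def linked_resources(start, catalogue):
    """Return the resources in the catalogue that `start` links to."""
    return [catalogue[uri] for uri in start.links if uri in catalogue]


# Hypothetical URIs and values, for illustration only.
handbook = Resource(
    uri="https://intranet.example.org/id/collection/hr-handbook",
    metadata={"dc:title": "HR Handbook", "dc:rights": "Internal use only"},
)
policy = Resource(
    uri="https://intranet.example.org/id/document/travel-policy",
    metadata={"dc:title": "Travel Policy", "dc:creator": "HR Department"},
    links=[handbook.uri],
)
catalogue = {r.uri: r for r in (handbook, policy)}

for target in linked_resources(policy, catalogue):
    print(policy.metadata["dc:title"], "->", target.metadata["dc:title"])
```

The point of the sketch is that the link is expressed between URIs, not between file paths or site names, so the description survives if the underlying platform changes.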

We will cover more about how this applies to Office 365 and SharePoint in our next post.

Please join our live breakfast talk in Gothenburg, or the online seminar and final panel discussion and Q&A using G+ Hangout, on 20 November, 8:00 AM to 10:00 AM Central European Time.
Fredric Landqvist, research blog
Mark Morrell, intranet-pioneer

Enterprise-Linked-Data and the Connected Digital Workplace

The emerging hyper-connected and agile enterprises of today are stigmatised by their IS/IT-legacy, so the question is: Will emerging web and semantic technologies and practices undo this stigma?

The Shift

Semantic Technologies and Linked-Open-Data (LOD) have evolved since Tim Berners-Lee introduced their basic concepts, and they are now part of everyday business on the Internet, thanks mainly to their uptake by information and data-driven companies like Google, social networks like Facebook and large content sites like Wikipedia. The enterprise information landscape is ready to be enhanced by the semantic web, to increase findability and usability. This change will enable a more agile digital workplace where members of staff can use cloud-based services, anywhere, anytime and on any device, in combination with the set of legacy systems backing their line-of-business. All in all, more efficient organising principles for information and data.

The Corporate Information Landscape of today

In the everyday workplace we use digital tools to cope with the tasks at hand. These tools have been put into action to apply meta models that structure the social life of dealing with information and data. The legacy of more than 60 years of digital record keeping has left us in an extremely complex environment, where most end-users have a multitude of spaces where they are supposed to contribute. In many cases their information environment lacks interoperability.

A good, or rather bad, example of this is the electronic health records (EHR) of a hospital, where several different health professionals try to codify their on-going work in order to make better informed decisions regarding the different medical treatments. While this is a good thing, it is heavily hampered by closed-down silos of data that do not work in conjunction with the new, more agile work practices. It is not uncommon for more than 20 different information systems to be used for provisioning during a workday.

The information systems architecture in any organisation or enterprise may comprise home-grown legacy systems from the past, software suites bought off the shelf and extremely complex enterprise-wide information systems like ERP, BI, CRM and the like. The connections between these information systems (or integration points) often resemble a point-to-point “spaghetti” syndrome. The work practice for many IT professionals is to map this landscape of connections and information flows, using for example Enterprise Architecture models. Many organisations use information integration engines, like enterprise-service-bus applications or master data applications, as a means to decouple the tight integration and get away from proprietary software lock-in.

On top of all these schema-based, structured-data information systems lies the social and collaborative layer of services, with things like the intranet (web-based applications), document management, enterprise-wide social networks (e.g. Yammer), collaborative platforms (e.g. SharePoint) and, more obviously, e-mail, instant messaging and voice/video meeting applications. All of these platforms and spaces where one carries out work tasks hold either semi-structured data (document management) or unstructured data.

Wayfinding

Survival in the enterprise information environment requires a large dose of endurance and skill. Many end-users get lost in their quest to find the relevant data when they should be concentrating on making well-informed decisions. Wayfinding is our in-built adaptive way of coping with the unexpected: finding different pathways and means to solve the issues. In other words … Findability.

Outside-in and Inside-Out

Today most workers in organisations and enterprises act on the edge of the corporate landscape, in networked conversations with customers, clients, patients/citizens, partners or even competitors, often employing means not necessarily hosted inside the corporate walls. On the Internet we see newly emerging technologies being used and adapted at a faster rate and in a more seamless fashion than the existing cumbersome ones of the internal information landscape. So the obvious question raised in all this flux is: why can’t our digital workplace (the inside information landscape) be as easy to use, and to find things and information in, as the external digital landscape? Why do I find knowledgeable peers in communities of practice more easily outside than I do on the inside? Knowledge sharing at the outposts of the corporate wall is vivid and truly passionate, whereas inside it is pretty stale and lame, to say the least.

Release the DATA now

Aggregate technologies, such as Business Intelligence and data warehouses, use an extract, transform and load (ETL) mechanism to capture, clean up, transform and load data from all the existing supporting information systems. The problem is that the schemas and structures of things do not combine that easily. Different uses and contexts make even the most central terms difficult to carry over into a new context. This simply does not work. The same problem can be seen in the enterprise search realm, where we try to cope with both unstructured and semi-structured data. One way of solving all this is to create one standard that all the others have to follow, including a least common denominator combined with master data management. In some cases this can work, but often the failures from such efforts are bigger than those arising from trying to squeeze an enterprise into a one-size-fits-all mega-matrix ERP system.

Why is that, you might ask? From the blueprint it sounds compelling: just align the business processes and then all data flows will follow a common path. The reality is unfortunately far more complex, because any organisation comprises several different processes, practices, professions and disciplines. These all have different perspectives on the information and data that is to be shared. This is precisely why we have so many applications in the first place! To what extent are we able to solve this with emerging semantic technologies? These technologies are not a silver bullet, far from it! The Web, however, shows a very different way of thinking about integration, with interoperability and standards becoming the main pillars that all the other things rely on. If you use agreed and controlled vocabularies and standards, there is a better chance of actually being able to sort out all the other things.

Remember that most members of staff work on the edges of the corporate body, so they have to align themselves with the lingo of all the external actor-networks and then translate it all into codified knowledge for the inside.

Semantic Interoperability

Today most end-users use internet applications and services that already use semantic enhancements to bridge the gap between things, without ever having to think about it. One omnipresent social network is Facebook, which relies upon the FOAF (Friend-of-a-Friend) standard for its OpenGraph. Using a graph to connect data is the very cornerstone of linked-data and the semantic web. A thing (entity) has descriptive properties and relations to other entities. One entity’s property might be another entity in the graph. This is the simple relationship subject-predicate-object. Hence from the graph we get a very flexible and resilient platform, in stark contrast to the more traditional fixed schemas.

The Semantic Web and Linked-Data are a way to link different data sets, which may grow from a multitude of schemas and contexts, into one fluid interlinked experience. If all internal supporting systems, or at least the aggregate engines, could simply apply a semantic texture to all the bits and bytes flowing around, it could well provide a solution in the areas where other set-ups have failed. Remember that these linked-data sets are resilient by nature.
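The subject-predicate-object idea is small enough to sketch in a few lines of Python. This is only an illustration under simple assumptions: the `ex:` identifiers are made up, and the graph is a plain list of tuples rather than a real triple store. The `foaf:knows` and `foaf:name` predicates are borrowed from the FOAF vocabulary.

```python
# Each statement in the graph is a (subject, predicate, object) triple.
# The "ex:" identifiers are hypothetical; foaf:knows and foaf:name come from FOAF.
triples = [
    ("ex:anna", "foaf:knows", "ex:bertil"),
    ("ex:anna", "foaf:name", "Anna"),
    ("ex:bertil", "foaf:name", "Bertil"),
    ("ex:bertil", "ex:worksFor", "ex:acme"),
]


def match(graph, s=None, p=None, o=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def names_of_contacts(graph, person):
    """Follow foaf:knows from a person, then look up each contact's name."""
    names = []
    for _, _, contact in match(graph, s=person, p="foaf:knows"):
        for _, _, name in match(graph, s=contact, p="foaf:name"):
            names.append(name)
    return names


print(names_of_contacts(triples, "ex:anna"))
```

Note that adding a new kind of relation, such as `ex:worksFor`, only means adding more triples. Nothing like a table schema has to change, which is the flexibility the graph model is credited with above.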

There is a set of controlled vocabularies (thesauri, ontologies and taxonomies) that capture the topics, themes and entities that make up the world. These vocabularies have to some extent already been developed, classified and given sound resource descriptions (RDF). The Linked-Open-Data cloud is experiencing rapid growth in meaningful expressions. Wikidata, DBpedia, Freebase and many more data-sets hold a vast set of crisp and useful data that, when intersected with internal vocabularies, can make things so much easier. A very good example of useful vocabularies developed by professional information science people is the Getty Research Institute’s recently released thesauri: AAT (Art & Architecture Thesaurus), CONA (Cultural Objects Name Authority) and TGN (Thesaurus of Geographic Names). These are very trustworthy resources, and using linked-data anybody developing a web or mobile app can reuse their namespace for free and with high accuracy. The same goes for all the other data-sets in the linked-open-data cloud. Many governments have declared open data to be the main innovation space in which to release their data, under the realm of the “Commons”.

In addition to this, all major search engines have agreed on a set of very simple-to-use schemas captured at www.schema.org. These schemas have been very well received by the webmaster community from their very inception. All of these feed into the Google Knowledge Graph and all the other smart (search-enabled) things we use daily.

For the corporate world, these Internet mega-trends have, or should have, a big impact on the way we do information management inside the corporate walls. This would particularly be the case if the siloed repositories and records were semantically enhanced from their inception (creation), for subsequent use and archiving. We would then see more flexible and fluid information management within the digital workplace.

The name of the game is interoperability at every level: not just technical device specifics, but interoperability at the semantic level and at the level we use governing principles for how we organise our data and information, regardless of their origin.

Stepping down, to some real-life examples

In the law enforcement system in any country, there is a set of actor-networks at play: the police, attorneys, courts, prisons and the like. All of them work within an inter-organisational process from capturing a suspect, filing a case, running a court session, judgement, sentencing and imprisonment; followed at the end by a reassimilated member of society.  Each of these actor-networks or public agencies have their own internal information landscape with supporting information systems, and they all rely on a coherent and smooth flow of information and data between each other. The problem is that while they may use similar vocabularies, the contexts in which they are used may be very different due to their different responsibilities and enacted environment (laws, regulations, policies, guidelines, processes and practices) when looking from a holistic perspective.

IA LOD Innovation

A way to supersede this would be to infuse semantic technologies and shared controlled vocabularies throughout, so that the mix of internal information systems could become interoperable regardless of the supporting information system or storage type. In such a case linked-open-data and semantic enhancements could glue and bridge the gaps to form one united composite, managed by just one individual’s record keeping. In such a way, the actual content would not be exposed, rather a metadata schema would be employed to cross any of the previously existing boundaries.

This is a win-win situation, as semantic technologies and any linked-open-data tinkering use the shared conversation (terms and terminologies) that already exists within the various parts of the process. While all parts adhere to the semantic layers, there is no need to reconfigure internal processes or apply other parties’ resource descriptions and elements. In this way only those parts of schemas that are context-specific for a given part of a process are used, allowing the lingo of the related practices and professions to be aligned.

This is already happening in practice in the internal workplace environment of an existing court, where a shared intranet is based on the organising principles already mentioned and applies sound and pragmatic information management practices and metadata standards like Dublin Core and common vocabularies, all of which are infused in content provisioning.

For the members of staff working inside a court setting, this is a major improvement, as they use external databases every day to gain insights in order to carry out their duties. And when the internal workplace uses such a set-up, their knowledge sharing can grow, leading to both improved wayfinding and findability.
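As a rough illustration of how shared vocabularies can bridge agencies without changing their internal lingo, consider the Python sketch below. Everything in it is hypothetical: the agencies' local terms, the concept URIs under `vocab.example.org` and the record identifiers. It only shows the principle that each party keeps its own terms while a mapping to shared concepts makes metadata records findable across boundaries.

```python
# Hypothetical mapping from (agency, local term) to a shared concept URI.
LOCAL_TO_SHARED = {
    ("police", "suspect"): "https://vocab.example.org/concept/accused-person",
    ("court", "defendant"): "https://vocab.example.org/concept/accused-person",
    ("prison", "inmate"): "https://vocab.example.org/concept/sentenced-person",
}

# Metadata-only records: the actual content stays inside each agency's system.
records = [
    {"agency": "police", "id": "P-1", "term": "suspect"},
    {"agency": "court", "id": "C-7", "term": "defendant"},
    {"agency": "prison", "id": "R-3", "term": "inmate"},
]


def find_by_concept(all_records, concept_uri):
    """Return the ids of records whose local term maps to the shared concept."""
    return [
        record["id"]
        for record in all_records
        if LOCAL_TO_SHARED.get((record["agency"], record["term"])) == concept_uri
    ]


print(find_by_concept(records, "https://vocab.example.org/concept/accused-person"))
```

The police keep saying "suspect" and the court keeps saying "defendant"; only the mapping knows that, for the purpose of finding related records, both point to the same shared concept.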

Yet another interesting case, is a service company that operates on a global scale. They are an authoritative resource in their line-of-business, maintaining a resource of rules and regulations that have become a canonical reference. By moving into a new expanded digital workplace environment (internet, extranet and intranet) and using semantic enhancement and search, they get a linked-data set that can be used by clients, competitors and all others working within their environment. At the same time their members of staff can use the very same vocabularies to semantically enhance their provision of information and data into the different information systems internally.

The last example is an industrial company with a mix of products within their line-of-business. They have grown through M&A over the years, and ended up in a dead-end mess of information systems that do not interoperate at all. Their way to overcome the effect of past mergers and acquisitions was to create an information governance framework. By applying it together with MDM and semantic search they were able to decouple data and information, and as a result made their workplace more resilient in a world of constant flux.

One could potentially apply these pragmatic steps to any line of business, since most themes and topics have been created and captured by the emerging semantic web and linked-data realm. It is only a matter of time before more organisations jump on this bandwagon in order to take advantage of changes that have the ability to make them a canonical reference and a market leader. Just think of the film industry’s IMDb.

A final thought: Are the vendors ready and open-minded enough to alter their software and online services in order to realise this outlined future enterprise information landscape?

For more information please read these online resources, or go for the executive brief video clip:
Enterprise-Linked-Data
http://testing.rachaelkalicun.info/led_book/led-contents.html

Exec Brief

Europeana brief for memory institutions using linked-open-data:
http://en.wikipedia.org/wiki/File:Linked-open-data-Europeana-video.ogv

Linked-Open-Data network Sweden 2014 presentation:
http://livingarchives.mah.se/2014/03/linked-data-2014/
and Fredric’s talk about semantic enhanced citizen participation and slides.

The future linked-data enterprise, from Intranätverk conference in Göteborg, in May 2014
Fredric Landqvist and Kerstin Forsbergs’s talk, and slides.

Gamification in Information Retrieval

My last article was mainly about Collaborative Information Seeking (CIS), one of the trends in enterprise search. Another interesting topic is the use of game mechanics in CIS systems. I came across this idea during the previously mentioned ESE 2014 conference, and interest is so high that this year a GamifIR workshop (Gamification for Information Retrieval) took place in Amsterdam. The IR community has debated what kind of benefits game techniques can bring to IR tasks. The workshop covered gamified tasks in the context of searching, natural language processing, analyzing user behavior and collaborating. The last of these was discussed in an article titled “Enhancing Collaborative Search Systems Engagement Through Gamification”, which Martin White mentioned in his great presentation about search trends at the last ESE summit.

Gamification is a concept that uses game elements in a non-game environment. Its goal is to improve the motivation of customers or employees to use a service. In the case of Information Retrieval that means, for example, encouraging people to find information in a more efficient way. It is quite instinctive, because competition is an inherent part of human nature. Business sectors noticed long ago that higher engagement, activating new users, establishing interaction between them and rewarding effort lead to measurable results, even if the quality of data given by users could be higher. Such game elements include leaderboards, levels, badges, achievements, time or resource limitations, challenges and many others. There are even documented design patterns and models connected with gameplay, components, game practices and processes. Such rules are essential, because a virtual badge has no value until a user assigns value to it.

Collaborative Information Seeking is an idea suited to people cooperating on a complex task that requires finding specific information. Systems like this support teamwork, coordinate actions and improve communication in many different ways, using various mechanisms. At first glance it seems that gamification is perfectly suited to CIS projects. Seekers become more social, and a feeling of competence fosters actions, which in turn are rewarded.

The most important thing is to know why we need a gamified system and what kind of benefits we will get. The next step is to understand the fundamental elements of a game and find out how to adapt them to the IR case. In their article “Enhancing Collaborative Search Systems Engagement Through Gamification”, researchers from the universities of Granada and Holguin have listed proposals for how to gamify a CIS system. Based on their suggestions, I think the essential point is to prepare a highly sociable environment for seekers. Every player (seeker) needs to have their own personal profile, which stores previous achievements and can be customized. Constant feedback on progress, a list of successful members, time limitations and keeping up the spirit of competition through all kinds of widgets are important for motivating seekers and building loyalty. It is worth remembering that points collected for achieving goals need to be converted into virtual values that distinguish the most active players. It is crucial to construct clear and fair principles, because information seeking with such elements is often fun, and that must not be ruined.

Researchers from Finnish universities, who published the article “Does Gamification Work?”, broke the problem of gamification down into components and studied them thoroughly. Their conclusion was that the concept of gamification can work, but that there are some weaknesses: the context that is going to be gamified and the quality of the users. Probably the main problem is a lack of knowledge about which elements really provide benefits.

Gamification can be treated as a new way to deal with complex data structures. The limitations of data analysis can be offset by mechanisms that increase the activity of users in the Information Retrieval process. Even more, such a concept may lead to higher-quality data because of increased motivation. I believe Collaborative Information Seeking, gamification and similar ideas are among the ways to improve the search experience: by helping people to become better searchers rather than just by tuning algorithms.
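To show how points, levels and a leaderboard could hang together, here is a minimal Python sketch. The actions, point values and level names are my own assumptions for illustration and are not taken from the article; a real CIS system would define its own rules.

```python
from collections import Counter

# Hypothetical point values per seeker action.
POINTS = {"query_shared": 1, "result_rated": 2, "answer_found": 5}
# Hypothetical levels as (minimum points, label), in ascending order.
LEVELS = [(0, "Novice"), (10, "Seeker"), (25, "Expert")]


def total_points(events):
    """Sum points per player from a list of (player, action) events."""
    totals = Counter()
    for player, action in events:
        totals[player] += POINTS.get(action, 0)
    return totals


def level_for(points):
    """Return the highest level whose threshold the points reach."""
    label = LEVELS[0][1]
    for threshold, name in LEVELS:
        if points >= threshold:
            label = name
    return label


def leaderboard(events):
    """Rank players by points (ties broken by name) and attach their level."""
    totals = total_points(events)
    ordered = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [(player, points, level_for(points)) for player, points in ordered]


events = [
    ("anna", "answer_found"),
    ("anna", "answer_found"),
    ("bertil", "result_rated"),
    ("anna", "query_shared"),
    ("bertil", "answer_found"),
]
for row in leaderboard(events):
    print(row)
```

Keeping the point table and the level thresholds as plain, visible data is one way to support the clear and fair principles mentioned above: every seeker can see exactly how their score came about.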

New look for the GSA-powered file share search at Implement Consulting Group

The file share search on Implement Consulting Group’s intranet is driven by a Google Search Appliance (GSA). Recently, with help from Findwise, the search interface was given a new look that integrates more seamlessly with the overall design of the intranet.

GSA comes with a default search interface similar to the Google.com search. The interface is easy to customize from GSA’s administrative interface, however, some features are simply not customizable by clicking around. Therefore, GSA supports the editing of an XSLT file for customizing the search. GSA returns the search results in XML format, and by processing this file with XSLT we can customise how the search results look and behave.
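The actual customisation was done in XSLT, but the idea of turning the XML results into a custom presentation can be sketched in Python as well. The sample below is an assumption on my part: it uses a tiny hand-written document with `GSP`, `RES`, `R`, `U`, `T` and `S` elements in the style of GSA's XML output, with made-up URLs and titles, and it is not the stylesheet used at Implement.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hand-written sample in the style of GSA XML results (assumed structure):
# R = one result, U = URL, T = title, S = snippet.
SAMPLE_XML = """<GSP>
  <RES>
    <R N="1"><U>http://files.example.org/a.pdf</U><T>Annual report</T><S>Summary of the year</S></R>
    <R N="2"><U>http://files.example.org/b.docx</U><T>Project plan</T><S>Milestones and owners</S></R>
  </RES>
</GSP>"""


def render_results(xml_text):
    """Turn the XML result list into a simple HTML list."""
    root = ET.fromstring(xml_text)
    items = []
    for result in root.iter("R"):
        url = result.findtext("U", default="")
        title = result.findtext("T", default=url)
        snippet = result.findtext("S", default="")
        items.append(
            '<li><a href="{}">{}</a><p>{}</p></li>'.format(
                escape(url, quote=True), escape(title), escape(snippet)
            )
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(render_results(SAMPLE_XML))
```

An XSLT stylesheet does the same kind of work declaratively: it matches each result element and emits the markup that the intranet's CSS and JavaScript then style.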

Custom CSS and JavaScript were used to integrate GSA’s search functionality into the look and feel of the intranet. Implement’s new intranet is based on thoughtfarmer.com and the design was delivered by 1508.dk.

– And here is the search results page with a new look:

Figure: The new look of the search results page on Implement Consulting Group’s Google Search Appliance-powered search

The search experience in SharePoint 2013: customised or targeted?

This post is the fourth in a series of four articles providing several best practices on how to implement and customise the search experience in SharePoint 2013. The previous posts listed the differences between the cloud and on-premise SharePoint, provided considerations when upgrading to SharePoint 2013, and dealt with the practicalities of configuring search in SharePoint Online. This fourth post handles the more advanced topic of ranking results and the future of search in SharePoint.

Managing ranking

We’ve previously mentioned query rules as a way to change the ranking of the search results based on your requirements. These allow the promotion of certain search results or search result blocks to the top of the ranked search results, and more advanced query rules even allow changing the ranking of the search results based on what the query terms are.

By using query rules, customising the search results web part and adding a few content by search web parts, you can change the behaviour of the search depending on which user is accessing it. You also need good metadata to make this work, but having a complete user profile (including the job title, department and interests) is a good start. Based on such user information, you can define what the search experience for that user will be.

Changing ranking using query rules, however, requires a query rule condition, which describes the prerequisites that the query must fulfil in order for the query rule to fire. For changing the results for all queries, you can use the next approach.

If the default ranking does not satisfy your search requirements and you want to change the order of the ranked search results, SharePoint provides the possibility of changing the ranking models. It is a feature available in SharePoint Online as well, as described in the TechNet documentation: “SharePoint Online customers need to download and install the free Rank Model Tuning App in order to create and customize ranking models.”

A ranking model contains the features and corresponding weights that are used in calculating a score for each search result. Changing the ranking models might require a deeper and theoretical knowledge of how search works, and those that take the challenge of changing the ranking model are often dedicated search administrators or external specialised consultants.
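To illustrate what "features and corresponding weights" means, here is a deliberately simplified Python sketch of a linear score. It is not SharePoint's ranking model, which is considerably more elaborate; the feature names, weights and documents are hypothetical.

```python
def score(features, weights):
    """Weighted sum of feature values; features without a weight count as zero."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def rank(results, weights):
    """Order results by descending score."""
    return sorted(results, key=lambda r: score(r["features"], weights), reverse=True)


# Hypothetical weights and feature values, for illustration only.
weights = {"title_match": 2.0, "body_match": 1.0, "is_official": 0.5}
results = [
    {"id": "draft.docx",
     "features": {"title_match": 1.0, "body_match": 0.2, "is_official": 0.0}},
    {"id": "policy.pdf",
     "features": {"title_match": 1.0, "body_match": 0.1, "is_official": 1.0}},
]

for result in rank(results, weights):
    print(result["id"], round(score(result["features"], weights), 2))
```

In this toy example the small weight on `is_official` is enough to lift the official document above the draft, even though the draft matches the body text slightly better. Changing a single weight shifts the order for every query that uses the model.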

The Ranking Model Tuning app is free on the App Store - http://office.microsoft.com/en-001/store/ranking-model-tuning-WA104192565.aspx

The Rank Model Tuning App provides a user interface for creating custom ranking models, and can be used for both SharePoint Online and SharePoint Server, though in SharePoint Server 2013 it is also possible to use PowerShell to customise ranking models. New models are based on existing ranking models, in which you can add or remove rank features and tune the weight of a rank feature. The app also allows for evaluating the new ranking model using a test set of queries. The set of test queries can be constructed from real queries made by users, gathered from previous search logs, for example. How to use the tuning app is explained step-by-step in the documentation on the Office site.

Changing the weight of certain file types (say for example for PowerPoint documents compared to Excel documents) might be enough for many search implementations, but depending on the content, the features that influence the ranking of the search results can become more elaborate. For example, a property defining whether documents are either official or work-in-progress might become an important factor in determining the ranking of search results. SharePoint provides the liberty to create new properties, so it makes sense that these can be used in search to improve the relevance.

It should be pointed out, however, that changing the ranking model influences all searches that are run using that ranking model. Though the main idea of changing the ranking model is to improve the ranking, it is all too easy to make changes that have an undesirable effect on it. This is why a proper evaluation of ranking changes needs to be part of your plan for improving search relevance.
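
One way to keep such an evaluation honest is to compare the old and new ranking on the same set of test queries. The sketch below uses mean reciprocal rank (MRR), a common relevance metric that we picked for illustration; it is not necessarily what the tuning app computes. The `search` callable and the relevance judgments are assumptions you would supply from your own search logs.

```python
from typing import Callable, Dict, List, Set


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1/position of the first relevant result, or 0.0 if none is returned."""
    for position, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    search: Callable[[str], List[str]],
    judgments: Dict[str, Set[str]],
) -> float:
    """Average reciprocal rank over all test queries.

    `search` maps a query to a ranked list of result ids, and `judgments`
    maps each test query to the ids judged relevant for it.
    """
    if not judgments:
        return 0.0
    total = sum(reciprocal_rank(search(q), rel) for q, rel in judgments.items())
    return total / len(judgments)
```

Running it once with the current model and once with the candidate model gives two numbers to compare before the change goes live.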

The Office Graph and the future of social

The social features introduced in SharePoint 2013 provide a rich social experience, which is interconnected with the search experience. Many social features are driven by search (such as the recommendations for which people or documents to follow), and social factors also affect the search (such as finding the right expertise from conversations in your network).

In June 2012 Microsoft acquired the enterprise social platform Yammer. The SharePoint Server 2013 Preview became available for download in July 2012, and the product reached Release to Manufacturing (RTM) in October the same year. SharePoint 2013 implements new social features (for example the newsfeed, the new My Sites and the tagging system), many of which overlap with those available in Yammer. This brings us to the question on everyone’s mind since the acquisition of Yammer: what is the future of social in SharePoint? Should you use SharePoint’s social features or use Yammer?

In March 2014, Microsoft announced that it will not add new features to SharePoint’s social capabilities, but will instead invest in the integration between Yammer and Office 365. The guidance is thus to go for Yammer.

“Go Yammer! While we’re committed to another on-premises release of SharePoint Server—and we’ll maintain its social capabilities—we don’t plan on adding new social features. Our investments in social will be focused on Yammer and Office 365” – Jared Spataro, Microsoft Office blog

Also at the SharePoint Conference in March 2014, Microsoft introduced the Office Graph, and with it Oslo, the first app demo using it. During the keynote, Microsoft said that the Office Graph is “perhaps the biggest idea we’ve had since the beginning of SharePoint”. The Office Graph maps relationships between people, the documents they authored, the likes and posts they made, and the emails they received; it is in fact an extension of Yammer’s enterprise graph. The Oslo application leverages the graph in a way that looks familiar from Facebook’s Graph Search.

The Office Graph, connecting people and information - Microsoft Office Blog http://blogs.office.com/2014/03/03/work-like-a-network-enterprise-social-and-the-future-of-work/

The new Office Graph provides exciting opportunities, and has consequences for how search will be used. Findwise started exploring the area of enterprise graph search before Microsoft announced the Office Graph – see our post about the Enterprise Graph Search from January 2013.
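
To give a feel for what a graph of people, documents and actions makes possible, here is a toy sketch. It is not the Office Graph API: the class, the action names ("follows", "authored") and the query are all hypothetical, and only show how a question such as "what have the people I follow written?" becomes a short walk over edges.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class ActivityGraph:
    """Toy graph of (actor, action) -> targets; purely illustrative."""

    def __init__(self) -> None:
        self._edges: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, actor: str, action: str, target: str) -> None:
        """Record that `actor` performed `action` on `target`."""
        self._edges[(actor, action)].add(target)

    def targets(self, actor: str, action: str) -> Set[str]:
        """Everything `actor` is connected to through `action`."""
        return set(self._edges.get((actor, action), set()))

    def documents_from_network(self, person: str) -> Set[str]:
        """Documents authored by the people that `person` follows."""
        documents: Set[str] = set()
        for colleague in self.targets(person, "follows"):
            documents |= self.targets(colleague, "authored")
        return documents
```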

Reluctant to go for the cloud?

Microsoft hinted during the SharePoint Conference keynote in March that it will add new functionality to the cloud version first. Although it is still committed to another version of SharePoint Server, updates might come at a slower pace for the on-premise version. However, Microsoft also announced that SharePoint SP1 brings new functionality in the administrative interface: a hybrid setting that lets you specify whether you want the social component in the cloud (Yammer), or your documents on OneDrive, so that you don’t need to move everything to the cloud overnight.

Let us know how far you’ve come with your SharePoint implementation! Contact us if you need help in deciding which version of SharePoint to choose, need help with tuning search relevance, have questions about improving search, or would like to work with us to reach the next level of findability.

Enterprise Search Europe 2014 – Short Review

ESE Summit

At the end of April, the third edition of the Enterprise Search Europe conference took place at the Park Plaza Victoria Hotel in London. The two-day event was dedicated to search solutions in the broadest sense. There were two tracks covering search management, big data, open source technologies, SharePoint and, as always, the future of search. According to the organizer, 30 experts presented their knowledge and experience of implementing search systems and making content findable. It was an opportunity to get familiar with many case studies focused on relevancy, text mining, systems architecture and even matching business requirements. There were also talks on softer skills, such as making decisions or finding good employees.

In short, the ESE 2014 summit was a great chance to meet highly skilled professionals with competence in business-driven search solutions. Representatives from both specialized consulting companies and universities were present. The second day started with a compelling plenary session about the direction of enterprise search. The presentation offered two points of view: that of Jeff Fried, CTO of BA-Insight, and that of Elaine Toms, Professor of Information Science at the University of Sheffield. From the industry perspective, analyzing user behavior, applying world knowledge and improving information structure are real successes. On the other hand, although IR systems are now mainstream, many problems remain: integration is still a challenge, the rules by which systems work are unclear, and organizations neglect to invest in search specialists. As Elaine Toms explained, the role of scientists is to reduce uncertainty by prototyping and by educating future researchers. According to her, the major search problems are primitive user interfaces and too few system services. What is more, data and information often become of secondary importance, even though they are the core of every search engine.

Trends

Among the many interesting presentations, one in particular caught my attention: “Collaborative Search” by Martin White, Conference Chair and Managing Director of Intranet Focus. Its subject was the current condition of enterprise search and the requirements such systems will have to meet in the future. Martin White is convinced that limited user satisfaction is mainly the fault of poor content quality and insufficient information management. The presentation covered absorbing results from several research studies. One of them, described in the paper “Characterizing and Supporting Cross-Device Search Tasks”, analyzed commercial search engine logs in order to find behavior patterns associated with cross-device searching. Switching between devices can be a hindrance, because the user has to remember both what they were searching for and what has already been found. The findings show that there are many opportunities to handle information seeking more effectively in a multi-device world. Saving and reinstating the user session, using the time between device switches to fetch more results, and using behavioral and geospatial data to predict task resumption are just a few of the ideas.

Even so, the most interesting part of Martin White’s presentation was dedicated to Collaborative Information Seeking (CIS).

Collaborative Information Seeking

It is natural that difficult and complex tasks force people to work together. Collaboration in information retrieval helps people use systems more effectively. The idea concentrates on situations in which people need to cooperate to seek information or make sense of it. CIS covers, on the one hand, elements connected with organizational behavior and decision making and, on the other, the evolution of user interfaces and the design of systems for immediate data processing. Furthermore, Martin White considers the CIS context to be focused on complex queries, “second phase” queries, results evaluation and ranking algorithms. The concept can bring the most value in domains like chemistry, medicine and law.

While exploring CIS I came across several related terms: collaborative information retrieval, social searching, co-browsing, collaborative navigation, collaborative information behavior, and collaborative information synthesis. My intention is to introduce some of them.

"Collaborative Information Seeking", Chirag Shah

Collaborative Information Retrieval (CIR) extends traditional IR to serve many users at once. It supports scenarios in which the problem is complicated and several people need to seek the same information. To support a group’s actions, it is crucial to know how the group works and what its strengths and weaknesses are. In general, such a system could be an overlay on a search engine that re-ranks results based on the knowledge of the user community. According to Chirag Shah, the author of the book “Collaborative Information Seeking”, there are examples of systems in which a workgroup’s queries and related results are captured and used to filter more relevant information for a particular user. One of the most absorbing cases is SearchTogether, an interface designed for collaborative web search, described by Meredith R. Morris and Eric Horvitz. It allows people to work both synchronously and asynchronously. The history of queries, page metadata and annotations carry information between users. It implemented both automatic and manual division of labor, and one of its features was recommending pages to another information seeker. All sessions and past findings were persisted and stored for future collaborative searching.
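
The “overlay that re-ranks results” idea can be sketched in a few lines. This is not how SearchTogether or any particular product works; it is a minimal illustration in which the engine’s original order gives a base score and every time a group member has saved a document adds a hypothetical `boost`.

```python
from collections import Counter
from typing import Iterable, List


def rerank_with_group_feedback(
    results: List[str],
    group_saved: Iterable[str],
    boost: float = 1.0,
) -> List[str]:
    """Re-rank engine results, lifting documents that colleagues have saved.

    `results` is the engine's ranked list, and `group_saved` lists the
    document ids saved by group members (repeats count as extra votes).
    """
    saves = Counter(group_saved)
    total = len(results)

    def score(item):
        position, doc = item
        base = (total - position) / total  # 1.0 for the engine's top hit
        return base + boost * saves[doc]

    ordered = sorted(enumerate(results), key=score, reverse=True)
    return [doc for _, doc in ordered]
```

With no group feedback the engine’s order is left untouched, so the overlay only changes results when the group actually knows something.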

Despite the many efforts made to develop such systems, probably none of them has been widely adopted. Perhaps this is partly due to their non-trivial nature and partly due to the lack of a concept for integrating them with other parts of collaboration in organizations.

Other ideas associated with CIS are Social Search and Collaborative Filtering. The first is about how social interactions can help people search together. Interestingly, although ties between people in social networks are rather weak, they can already be seen to strengthen in collaborative networks. The second refers to providing more relevant search results based on the user’s past behavior, and also on a community of users displaying similar interests. It is noteworthy that this is an example of asynchronous interaction, because its value is based on past actions, in contrast with CIS, where the emphasis is on active communication between users. Collaborative Filtering has been applied in many domains: industry, finance, insurance and the web. At present the last one is the most common, and it is used in e-commerce. CF methods form the basis of recommender systems that predict user preferences. It is such a broad topic that it certainly deserves a separate article.
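
As a small taste of the topic, here is a sketch of user-based collaborative filtering. The choice of Jaccard similarity and the data layout are assumptions made for brevity; production recommender systems use far richer signals.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two sets of items, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user: str, history: Dict[str, Set[str]], top_n: int = 3) -> List[str]:
    """Suggest items that users with similar histories have opened.

    `history` maps each user to the set of items they interacted with.
    Each unseen item is scored by the summed similarity of its owners.
    """
    seen = history.get(user, set())
    scores: Dict[str, float] = {}
    for other, items in history.items():
        if other == user:
            continue
        similarity = jaccard(seen, items)
        if similarity == 0.0:
            continue  # nothing in common, so no evidence either way
        for item in items - seen:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]
```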

CIS Barriers

Regardless of all this research, CIS faces many challenges today. One of them is information security in the company. What should happen when team members do not have the same security profile, or when a person is not even allowed to share with others what has been found? Such systems cannot be built for information seeking alone: they also need to manage security, and to handle situations in which results were not found because of permissions, or in which a new document created during the cooperation needs to be viewed. On top of that, there are various barriers within organizations that hinder CIS. They fall into four categories: organizational, technical, individual, and team. They include things such as organizational culture and structure, multiple un-integrated systems, individual perception, and the various conflicts that arise during team work. The barriers and their implications are described in detail in the paper “Barriers to Collaborative Information Seeking in Organizations” by Arvind Karunakaran and Madhu Reddy.

Collaborative information seeking is an exciting field of research and one of the current search trends. Another absorbing topic is the adoption of gamification in IR systems. This is going to be the subject of my next article.

Customizing search in SharePoint Online

Search in SharePoint 2013 – Part 3: Customizing search in SharePoint Online

This post is the third in a series of four articles providing several best practices on how to implement and customise search in SharePoint. In the first post, we provided a brief overview of the differences in terms of search between the on-premise and cloud versions, and in the second blog post we discussed several things you should consider when migrating to the new SharePoint. In this post, we will mention several search features that can be configured in SharePoint Online, referring specifically to those available in the Enterprise Plan.

Here is a summary of what customisations for search in SharePoint Online will be discussed:

  • Defining your own custom result sources, and hiding any that you are not using
  • Setting up hybrid search if you chose a hybrid solution
  • Defining which refiners to show and how to display them
  • Adding query suggestions that are related to your organisation
  • Adding query spelling corrections
  • Changing how the search results are displayed to show previews and additional metadata

Get ready to search ‘everything’

This is the uncustomized search box that you will see on your search center page.  Please note that in some SharePoint Online plans the ‘Videos’ vertical is not available.

Everything is the default scope when performing a search in the SharePoint search center, and it returns every type of result from all of your site collections. A few other scopes (search verticals, or so-called Result Sources) are included by default: People, Conversations, and Videos. These are preconfigured to search on what you would expect.

  • You can add new result sources, for example Reports, which shows only search results that are tagged with the keyword ‘Final Report’. You define the criteria for a result source yourself.
  • If there is a result source that you are not using, for example if you have no video content and don’t plan to have any in the near future, it’s less confusing for the users if you simply don’t show it for now. It’s easy to add it back if you need it later.
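
The criteria of a result source are expressed as a query transform, in which the `{searchTerms}` variable stands for whatever the user typed. The sketch below only imitates that substitution so you can see what the ‘Reports’ example boils down to. The property name `Keywords` is an assumption for illustration; use whichever managed property holds the tag in your own search schema.

```python
# Hypothetical transform for a 'Reports' result source. The managed property
# name "Keywords" is an assumption, not taken from a real configuration.
REPORTS_TRANSFORM = '{searchTerms} Keywords:"Final Report"'


def apply_query_transform(transform: str, user_query: str) -> str:
    """Imitate a result source by substituting the user's query into it."""
    return transform.replace("{searchTerms}", user_query.strip())
```

A query for “budget 2014” in the Reports vertical would then be sent to the index with the extra restriction appended.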

If you choose a hybrid solution, your content is split between the online SharePoint and the on-premise SharePoint Server.

  • It’s possible to have one search that displays results from both locations. For example, to show results from the on-premise installation in SharePoint Online, you have to define a new result source that is able to retrieve the results from the on-premise installation. Then you can configure the search results page to show results from both result sources (everything from SharePoint Online plus everything from SharePoint on-premise that matches the search query).

Screenshot from the post Hybrid search by the Microsoft SharePoint Team Blog showing how results from the cloud are integrated in the search results page when the user searches from an on-premises SharePoint 2013 site.  Notice also the new visual refiner for date interval in the refinement panel on the left.

Drill down into the search results

The search Refiners allow the users to drill down into the search results. There is a new type of refiner in SharePoint 2013, a visual refiner, by default used for the ‘Modified Date’.

  • The way refiners are visualised has changed drastically, and you can define your own visualisation of the data if you want to. For example, what about a map as a refiner, instead of a list of city names?

By default, the refiners you will see are the Result type (example values: Excel, Web page), Author (example values: John Doe, Jane Doe), and Modified Date (shown as a distribution of values).

  • If you edit the web part responsible for the refiners, you will be able to add other refiners as well. For example, company names are automatically extracted from your content, so it is easy to simply add that to your refiners.
  • Another useful refiner to show to your users is the Content Type, which offers one more level of detail than the Result Type refiner.

Search guidance

Query suggestions are displayed as the user types.

As the user types a query in the search box, SharePoint is able to show Query Suggestions that help complete the query. SharePoint automatically creates a list of suggestions based on previous searches. When at least 6 search results are clicked for a specific query, that query will be added to the list of suggestions.

  • Besides the list that SharePoint creates automatically, you are able to add your own list of suggestions. This is especially useful when starting fresh, since a fresh installation comes with no query suggestions. You could help the users by adding your company name, product names or similar to the initial list of suggestions. Adding suggestions manually is also useful after reviewing the search logs: the logs can give you a new perspective on what the users are looking for, and you can use that input to guide users to the relevant results.
  • You are also able to import a list of phrases that should never be shown as suggestions. Say for example that your testing team uses a specific keyword for testing content. In this case, it is very probable that the test keyword will soon appear as a suggestion for all users. To avoid this, simply add the keyword to the query suggestion exclusion list.
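
The interplay between the automatic list, your own inclusions and the exclusion list can be sketched as follows. Only the threshold of six clicks comes from the behaviour described above; the class and method names are hypothetical, and SharePoint’s real implementation is certainly more involved.

```python
from collections import Counter
from typing import Iterable, List

CLICK_THRESHOLD = 6  # clicks on results needed before a query is suggested


class SuggestionList:
    """Toy model of query suggestions with inclusion and exclusion lists."""

    def __init__(
        self,
        always_suggest: Iterable[str] = (),
        never_suggest: Iterable[str] = (),
    ) -> None:
        self._clicks: Counter = Counter()
        self._always = {phrase.lower() for phrase in always_suggest}
        self._never = {phrase.lower() for phrase in never_suggest}

    def record_click(self, query: str) -> None:
        """Register that a result was clicked for this query."""
        self._clicks[query.strip().lower()] += 1

    def suggest(self, prefix: str) -> List[str]:
        """Suggestions starting with `prefix`, in alphabetical order."""
        prefix = prefix.strip().lower()
        learned = {q for q, n in self._clicks.items() if n >= CLICK_THRESHOLD}
        candidates = (learned | self._always) - self._never
        return sorted(q for q in candidates if q.startswith(prefix))
```

The exclusion list wins over both other sources, which is what keeps a heavily used test keyword out of everyone’s search box.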

Similar to the query suggestions, another functionality whose purpose is to help the user in formulating the query is the Query Spelling Correction. An inclusion and an exclusion list are used in this case as well; the only difference is that these are managed in the Term Store, while query suggestions are managed by importing a plain text file.

  • You can add your own terms in the query spelling correction inclusion and exclusion lists. Probably one of the most often misspelled words is the word ‘business’. Or was it ‘bussiness’? After adding this term to the list of words to be included in the spelling suggestions, the correct form of the word would be shown under the ‘Did you mean’ functionality if the user misspells it.
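
The ‘Did you mean’ behaviour can be approximated with a simple closest-match lookup against the inclusion list. This sketch uses Python’s standard `difflib` module rather than anything from SharePoint, and the similarity cutoff of 0.8 is an arbitrary assumption.

```python
import difflib
from typing import Iterable, Optional


def did_you_mean(
    term: str,
    inclusion: Iterable[str],
    exclusion: Iterable[str] = (),
) -> Optional[str]:
    """Return a spelling suggestion for `term`, or None if there is none.

    Words on the exclusion list are never suggested, and a term that is
    already spelled like a listed word needs no suggestion.
    """
    word = term.lower()
    excluded = {e.lower() for e in exclusion}
    candidates = [w.lower() for w in inclusion if w.lower() not in excluded]
    if word in candidates:
        return None
    matches = difflib.get_close_matches(word, candidates, n=1, cutoff=0.8)
    return matches[0] if matches else None
```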

Change how the search results are displayed

Screenshot from an Office Blogs post showing the hover panel for a PowerPoint document.

A final item on our list of proposed customisations for your search results is to change how the search results are displayed. In SharePoint 2013, it is the Display Templates that define how each element in the search results page is displayed. For example, there is a template for the refiner, another one for the hover panel of a PDF item, another one for the hover panel of a Word item, and so on.

  • A simple fix would be to make sure that you have previews for PDF files in the hover panel. It is the Office Web Apps that power the previews for Office documents (such as Word, PowerPoint, Excel), but the preview for PDF files might not be visible for you. If so, what you can do is change the display template that is associated with the PDF result type.
  • You can also define what metadata to show for each result type. For example, for a Word document you would by default be able to see the Title, a text snippet and a URL, and in the hover panel the document preview, Last Modified date and author, as well as probably a list of the main headings from the document. However, if you have added additional metadata to your document, such as Location or Keywords, you can display these in the search results as well by modifying the right display template.

You can find more information about how to administer many of these search functionalities from this Microsoft Office page and from our search experts. Let us know how far you are in implementing SharePoint Online for your organisation – we certainly have a few more tips on how to configure and customise search in SharePoint!

Cloud vs. on-premise SharePoint 2013 search

Search in SharePoint 2013 – Part 1: The difference between search within on-premise SharePoint 2013 and SharePoint Online

Cloud or on-premise? Findwise offers implementation and consulting services for both scenarios. This post is the first in a series of four articles providing several best practices on how to implement and customise search in SharePoint. The focus of this first post is introducing the difference between the cloud and on-premise SharePoint 2013 in terms of search features.

“The cloud is on fire”

That is a quote from Microsoft Office General Manager Jared Spataro during his keynote at the SharePoint Conference in Las Vegas last month. At the conference, Microsoft revealed that 60% of Fortune 500 companies adopted Office 365 in the previous 12 months. While new on-premise versions of SharePoint and Exchange Server are still promised for next year, Microsoft is adding more and more capabilities to the cloud version.

SPC14 Keynote summary

Fun random facts about SharePoint Online presented during the keynote at the SharePoint conference in Las Vegas this year (March 3rd 2014)

In addition to the numbers above, a market analysis report done by The Radicati Group on the adoption of Microsoft SharePoint reveals that almost a quarter of the worldwide users accessing deployments of SharePoint made during the year 2013 are using the cloud based SharePoint.

When deciding whether to go for the on-premise or cloud solution, a go-to resource for your IT team is the TechNet article describing the availability of features across the solutions. That article not only divides the features between on-premise and cloud, but also between the different Office 365 and SharePoint Online plans. What is the difference? SharePoint Online is the cloud version of SharePoint Server, but it can be deployed as a standalone service or as part of the Office 365 suite, so different plans are usually listed for these different scenarios. There are also the Office 365 Dedicated plans, but these are out of scope for this article. The Microsoft Office site has a more business-oriented comparison of the different plans, including pricing. If you cannot decide on one or the other, there is also the possibility of a hybrid solution!

Plans and licences compared (columns 1 to 8):
  1. Office 365 Small Business, Office 365 Small Business Premium
  2. Office 365 Midsize Business, Office 365 Enterprise E1 or K1, Office 365 Education A2, Office 365 Government G1 or K1
  3. Office 365 Enterprise E3 or E4, Office 365 Education A3 or A4, Office 365 Government G3 or G4
  4. SharePoint Online Plan 1
  5. SharePoint Online Plan 2
  6. SharePoint Foundation 2013
  7. SharePoint Server 2013 Standard CAL
  8. SharePoint Server 2013 Enterprise CAL

Availability / Search feature | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8

Available within all plans
Phonetic name matching | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Expertise Search | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Quick preview | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
RESTful Query API/Query OM | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Result sources | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Search results sorting | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Ranking models | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Query spelling correction | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Refiners | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Manage search schema | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes

Available in all Office365 and SharePoint Online plans
Deep links | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes
Event-based relevancy | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes
Graphical refiners | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes
Recommendations | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes
Search vertical: “Conversations” | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes
Search vertical: “People” | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes
Query suggestions | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes
Query throttling | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes
“This List” searches | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes
Query rules – add promoted results | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes

Avail. in Office365
Advanced Content Processing | Yes | Yes | Yes | No | No | Yes | Yes | Yes
Hybrid search | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Query rules – advanced actions | No | No | Yes | No | No | No | No | Yes
Search vertical: “Video” | No | No | Yes | No | Yes | No | No | Yes

Not available in any of the Office 365, SharePoint Online plans
Search connector framework | No | No | No | No | No | No | Yes | Yes
Custom entity extraction | No | No | No | No | No | No | No | Yes
Extensible content processing | No | No | No | No | No | No | No | Yes

– Simplified view of the TechNet article, focusing on the search features availability across SharePoint solutions

Limitations in Office 365 and SharePoint Online plans

Is the cloud version good enough for your organisation when it comes to search features? The table above illustrates some of the things that you might be missing in terms of search, and in what follows we will discuss those whose availability varies amongst the Office 365 or SharePoint Online plans.

Query rules – advanced actions

In order to adapt the relevance of the search results to the user intent, SharePoint 2013 adds a new feature called query rules. A query rule is defined by a condition and a corresponding action to be taken when the condition is met. Within some SharePoint Online licenses, this functionality is limited to adding promoted results, while more advanced actions are left out. Promoted results are similar to what previous SharePoint versions called search keywords, or best bets, and let you promote specific results on top of the ranked search results. The more advanced actions could consist of, for example, changing the query or changing the ranking of the search results by promoting a certain group of results. You can read more about various usages of query rules in one of our previous blog posts.
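
The condition-plus-action structure of a query rule is easy to picture in code. The sketch below models only the simplest action, promoted results, and every name in it is hypothetical; it is meant as an illustration of the concept rather than of SharePoint’s own object model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QueryRule:
    """A condition on the query plus the results to promote when it fires."""

    condition: Callable[[str], bool]
    promoted_results: List[str]


def apply_rules(query: str, ranked: List[str], rules: List[QueryRule]) -> List[str]:
    """Place promoted results from every firing rule above the ranked list."""
    promoted: List[str] = []
    for rule in rules:
        if rule.condition(query):
            for url in rule.promoted_results:
                if url not in promoted:
                    promoted.append(url)
    # Avoid showing a promoted result a second time further down.
    return promoted + [url for url in ranked if url not in promoted]
```

For example, a rule whose condition is “the query contains the word holiday” could promote the leave policy page above whatever the ranking model returns.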

Search Connector Framework and Hybrid Search

Administrators of SharePoint Online will miss the ability to manage search connectors to different content sources, since the search connector framework is not available. Only SharePoint content that is stored online is going to be indexed. Search results can only be retrieved from that content, or can be set up to be retrieved from an Exchange Server, from a remote SharePoint, or from a search engine that uses the OpenSearch protocol. As an alternative approach to making content from other sources searchable, you can set up hybrid search. This feature is available in almost all Office 365 and SharePoint Online scenarios. It lets users see search results from content available both in the cloud and on-premise. So if you would like to index a content source that is not supported in SharePoint Online, you should be able to index it in the on-premise installation.

Custom Entity Extraction

The TechNet article describing features across solutions actually shows that this feature is only available with the enterprise licensing of SharePoint Server. This feature allows the extraction of custom-defined terms from your content and making them usable as search refiners. Say for example that you would like to extract all of your current product names from the content of your documents and then be able to refine your search results on the product name.
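
Dictionary-based extraction of this kind can be sketched with a simple whole-word lookup. This is an illustration of the idea only, with made-up product names; in SharePoint Server the dictionaries are deployed through its own administration tooling rather than written as code.

```python
import re
from typing import Iterable, List


def extract_entities(text: str, dictionary: Iterable[str]) -> List[str]:
    """Return the dictionary entries that occur in `text` as whole words.

    Matching ignores case; results keep the spelling used in the dictionary
    so that they can serve directly as refiner values.
    """
    found = []
    for name in dictionary:
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(name)
    return found
```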

Content Processing Extensibility

The other search feature that is only available with the enterprise licensing of SharePoint Server is the content processing extensibility. In practice, this means there is an API that can be used to transform the data before it is stored in the index. For example, more advanced entity extraction can be done at this step. While the custom entity extraction discussed previously identifies names in the content based on a pre-defined list of names, through this API you could use a trained model to do the extraction. Additional use cases could be cleaning or normalising the data according to predefined rules (for example, lowercasing all values in a property), or automatically tagging items based on the content.
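
The use cases above amount to a small pipeline of steps that each item passes through before it is indexed. The sketch below shows that shape with two hypothetical steps, one normalising a property and one tagging by content. It does not use the SharePoint API, and the property names are assumptions.

```python
from typing import Callable, Dict, List

Item = Dict[str, object]
Step = Callable[[Item], Item]


def lowercase_property(name: str) -> Step:
    """Build a step that lowercases the string value of one property."""

    def step(item: Item) -> Item:
        value = item.get(name)
        if isinstance(value, str):
            return {**item, name: value.lower()}
        return item

    return step


def tag_by_content(keyword_tags: Dict[str, str]) -> Step:
    """Build a step that adds a tag for every keyword found in the body."""

    def step(item: Item) -> Item:
        body = str(item.get("body", "")).lower()
        tags = sorted({tag for kw, tag in keyword_tags.items() if kw.lower() in body})
        return {**item, "tags": tags}

    return step


def process(item: Item, steps: List[Step]) -> Item:
    """Run the item through each step in order and return the result."""
    for step in steps:
        item = step(item)
    return item
```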

It should be noted that the TechNet article is not a comprehensive list, but rather gives an overview of the major differences between solutions. Here is, for example, one more feature whose availability is limited.

Synonyms

One of the missing features in SharePoint Online that is available in the on-premise solution is the possibility of defining synonyms. Since it’s too easy to communicate the same thing with different words, defining synonyms or abbreviations for search phrases can help aggregate the results for the multiple ways of expressing the same information need. We hope that Microsoft will integrate this feature in the future versions of SharePoint Online as well.
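
What synonyms do for a query can be shown with a small expansion function. The synonym table and the OR syntax of the output are assumptions for illustration; on-premise SharePoint manages synonyms through its own thesaurus mechanism rather than through code like this.

```python
from typing import Dict, List


def expand_query(query: str, synonyms: Dict[str, List[str]]) -> str:
    """Rewrite each term that has synonyms as an OR group.

    `synonyms` maps a lowercase term to its alternatives. Alternatives
    made of several words are quoted so that they stay together.
    """
    parts = []
    for term in query.split():
        alternatives = [term] + synonyms.get(term.lower(), [])
        if len(alternatives) == 1:
            parts.append(term)
        else:
            quoted = [f'"{a}"' if " " in a else a for a in alternatives]
            parts.append("(" + " OR ".join(quoted) + ")")
    return " ".join(parts)
```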

Find the right documentation

When checking on the Microsoft Office.com website or TechNet which functionality is available across solutions, make sure that the functionality discussed applies to your version of SharePoint. Articles usually indicate which versions the functionality applies to.

Feature availability in MS articles

Articles on Office.com (left) and TechNet (right) indicate which version of SharePoint the discussed topic applies to.

Please note that things might change: new updates in SharePoint Online can add functionality that was missing before. To stay up-to-date, check the TechNet page once in a while, or contact us to help you map your requirements to the available search features across solutions.

Event driven indexing for SharePoint 2013

In a previous post, we explained the continuous crawl, a new feature in SharePoint 2013 that overcomes previous limitations of the incremental crawl by closing the gap between the time when a document is updated and when the change is visible in search. A different concept in this area is event driven indexing.

Content pull vs. content push

In the case of event driven indexing, the index is updated in real time as an item is added or changed. The event of updating the item triggers the actual indexing of that item, i.e. pushes the content to the index. Similarly, deleting an item results in the item being deleted from the index immediately, removing it from the search results.

The three types of crawl available in SharePoint 2013 (full, incremental and continuous) all use the opposite method, pulling content. A crawl is either initiated by the user or automated to start at a specified time or at time intervals.

The following image outlines the two scenarios: the first one illustrates crawling content on demand (as it is done for the full, incremental and continuous crawls) and the second one illustrates event-driven indexing (immediately pushing content to the index on an update).

Pulling vs pushing content, showing the advantage of event driven indexing

Pulling vs pushing content
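The difference between the two models can be sketched in a few lines of Python. This is a minimal illustration only, not SharePoint code and not the Findwise connector: the `ContentSource`, `SearchIndex`, `crawl` and `attach_push_connector` names are hypothetical, and a real crawl would of course do far more than copy a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class SearchIndex:
    """Toy in-memory index (hypothetical): maps item id to indexed text."""
    docs: dict = field(default_factory=dict)

    def upsert(self, item_id, text):
        self.docs[item_id] = text

    def delete(self, item_id):
        self.docs.pop(item_id, None)


class ContentSource:
    """Toy content store (hypothetical) that notifies listeners on change."""

    def __init__(self):
        self.items = {}
        self.listeners = []

    def subscribe(self, listener):
        self.listeners.append(listener)

    def save(self, item_id, text):
        self.items[item_id] = text
        for listener in self.listeners:
            listener("saved", item_id, text)

    def remove(self, item_id):
        self.items.pop(item_id, None)
        for listener in self.listeners:
            listener("deleted", item_id, None)


def crawl(source, index):
    """Pull model: walk the source and rebuild the index from what is found."""
    index.docs = dict(source.items)


def attach_push_connector(source, index):
    """Push model: every change event updates the index immediately."""
    def on_event(kind, item_id, text):
        if kind == "deleted":
            index.delete(item_id)
        else:
            index.upsert(item_id, text)
    source.subscribe(on_event)


source = ContentSource()
pulled, pushed = SearchIndex(), SearchIndex()
attach_push_connector(source, pushed)

source.save("doc-1", "quarterly report")
print("doc-1" in pushed.docs)  # True: the event pushed it to the index
print("doc-1" in pulled.docs)  # False: no crawl has run yet
crawl(source, pulled)
print("doc-1" in pulled.docs)  # True: visible only after the crawl
```

In the pull model the index is only as fresh as the last crawl, whereas in the push model the save or delete itself carries the update to the index.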

Example use cases

The following are only some of the use cases where an event-driven push connector can make a big difference to how quickly users can access new content or the newest versions of existing content:

  • Be alerted instantly when an item of interest is added in SharePoint by another user.
  • Have deleted content removed from search immediately.
  • Avoid the annoying situation of adding or updating a document in SharePoint and not being able to find it in search.
  • View real-time calculations and dashboards based on your content.

Findwise SharePoint Push connector

Findwise has developed a connector for its SharePoint customers that performs event driven indexing of SharePoint content. After the connector is installed, one full crawl of the content is required, after which all updates are instantly available in search. The delay between the time a document is updated and the time it becomes available in search is reduced to the time it takes to process the document (that is, to convert it from what you see to a corresponding representation in the search index).

Both FAST ESP and FAST Search Server 2010 for SharePoint (FS4SP) allow content to be pushed to the index, but this capability was removed from SharePoint 2013. This means that even though we can capture changes to content in real time, we are missing the interface for sending the update to the search index. This might be a game changer for you if you want to use SharePoint 2013 and take advantage of event driven indexing, since it means you would have to use another search engine, one that has an interface for pushing content to the index. We have ourselves used a free open source search engine for this purpose. Moving the search index outside the SharePoint environment also allows search to be integrated with other enterprise platforms, which opens up possibilities for connecting different systems through search. Findwise can assist you in choosing the right tools to get the desired search solution.

Another aspect of event driven indexing is that it limits the resources required to traverse a SharePoint instance. Instead of running a continuous process that looks for changes, the changes arrive automatically as they occur, which limits the work required to pick up each change. This is important, since the resource demand for keeping an index up to date can at times be very high in SharePoint installations.

There is also a downside to consider when working with push driven indexing: it is more difficult to keep track of the state of the index when problems occur. For example, if one of the components of the connector goes down and no pushed data is received for a period of time, it becomes more difficult to follow up on what went missing. To catch the data that was added or updated during the downtime, a full crawl needs to be run. Catching deletes is solved either by keeping a record of the currently indexed data, or by comparing the content with the actual search engine index during the full crawl. Findwise has worked extensively on choosing reliable components with a strong focus on robustness and stability.
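The recovery step described above, comparing what is in the source with what is in the index, can be sketched as follows. This is an assumption-laden illustration rather than the connector's actual logic: the `reconcile` and `apply_repairs` functions are hypothetical names, and both the source and the index are represented as plain dictionaries mapping an item id to its content.

```python
def reconcile(source_items, index_docs):
    """Compare the source with the index and return the repairs needed.

    Both arguments are dicts mapping item id -> content (an assumption
    made for this sketch). Returns (ids to add or update, ids to delete).
    """
    # Items still in the index but gone from the source: missed deletes.
    to_delete = set(index_docs) - set(source_items)
    # Items missing from the index or indexed with stale content.
    to_upsert = {
        item_id
        for item_id, text in source_items.items()
        if item_id not in index_docs or index_docs[item_id] != text
    }
    return to_upsert, to_delete


def apply_repairs(source_items, index_docs, to_upsert, to_delete):
    """Bring the index back in line with the source."""
    for item_id in to_delete:
        del index_docs[item_id]
    for item_id in to_upsert:
        index_docs[item_id] = source_items[item_id]


# State after an outage: "a" was updated, "b" was deleted and "c" was
# added while no push events were being received.
source_now = {"a": "v2", "c": "new"}
index_stale = {"a": "v1", "b": "old"}

to_upsert, to_delete = reconcile(source_now, index_stale)
apply_repairs(source_now, index_stale, to_upsert, to_delete)
print(index_stale == source_now)  # True
```

The comparison needs a complete listing of the source, which is why a full crawl is the natural moment to run it.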

The push connector has been used in projects with both SharePoint 2010 and 2013, and has been tested internally with SharePoint 2007. Unfortunately, SharePoint 2007 has a limited set of event receivers, which limits the possibility of pure event driven indexing. Also, at the moment the connector cannot be used with SharePoint Online.

You will probably be able to add a few more examples to the use cases for event driven indexing listed in this post. Let us know what you think! And get in touch with us if you are interested in finding out more about the benefits and implications of event driven indexing and in learning how to reach the next level of findability.