Spinque's 10-Year Anniversary

Posted on 03/06/2020 by Roberto Cornacchia.

On 28 May 2020 we celebrated our 10-year anniversary. We had planned a nice party in our hometown of Utrecht, but unfortunately live events are out of the question at the moment due to the Covid-19 pandemic. Instead, we invited our closest clients and partners, who helped define what Spinque has become today, for online drinks and a moment of reflection on the road we have taken together - and on the road ahead.

A research PoC for patent retrieval

How did it all start? As is often the case, it took the right combination of people, ideas, and events. When they all fit together like the pieces of a jigsaw puzzle, the magic happens.

Ten years ago, Arjen, Wouter and I (Roberto) were doing research at CWI in Amsterdam, which hosts top-level research groups in database architectures, information retrieval, and semantic web that continuously feed each other. It was the perfect setting for studying an integrated approach to 'Search'. These were the questions we were dealing with:

'Can we apply information retrieval algorithms and semantic web reasoning to a knowledge graph, leveraging the computing power of a database system? Can we do that in a human-friendly way?'

We thought we had some good cards up our sleeves, and the perfect occasion to prove ourselves came during an informal chat at a conference.

Our conversation partner was building a large and highly enriched database of patent documents, and our research interests seemed a perfect match for the complex task of patent retrieval.

Patent documents offer a unique mixture of challenges and handles when it comes to information retrieval. To mention a few: they are highly structured, with several sections and subsections; they are multilingual; they incorporate several writing styles in one document (e.g. legal and technical sections); they are interlinked; they constitute a fairly large body of data.

We were excited about the challenge and the first PoC was ready in a matter of days. It had no graphical interface, only a command line interface. Despite the very different look, it already encapsulated the essence of what would become Spinque Desk as we know it today: an environment to build Search Strategies out of various Building Blocks. Our software then transformed these search strategies into actual search engines, a way of working that is exactly what we offer now in Spinque Desk.

Things already had the right shape. We had a system that allowed us to answer questions about data by putting together operational blocks capable of performing typical IR tasks by means of database operations.
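To give a flavour of that idea, here is a minimal sketch of building blocks expressed as database operations. It is an illustration only, not Spinque's actual code: the `triples` table layout, the block functions (`filter_by_type`, `rank_by_term`, `combine`) and the rule of multiplying probabilities are all assumptions made for this example, and SQLite stands in for the real database system.

```python
import sqlite3

def make_db():
    # Hypothetical toy data: (subject, property, object, probability).
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE triples (subject TEXT, property TEXT, object TEXT, p REAL)")
    db.executemany("INSERT INTO triples VALUES (?, ?, ?, ?)", [
        ("doc1", "type", "patent", 1.0),
        ("doc1", "term", "engine", 0.8),
        ("doc2", "type", "patent", 1.0),
        ("doc2", "term", "engine", 0.3),
        ("doc3", "type", "article", 1.0),
        ("doc3", "term", "engine", 0.9),
    ])
    return db

# Each block is a (sql, params) pair that yields (subject, p) rows.
def filter_by_type(type_name):
    return ("SELECT subject, p FROM triples WHERE property = 'type' AND object = ?",
            [type_name])

def rank_by_term(term):
    return ("SELECT subject, p FROM triples WHERE property = 'term' AND object = ?",
            [term])

def combine(a, b):
    # Connect two blocks: keep subjects present in both, multiply their scores
    # (an assumed independence rule, chosen only to keep the sketch small).
    sql = (f"SELECT a.subject, a.p * b.p AS p "
           f"FROM ({a[0]}) a JOIN ({b[0]}) b ON a.subject = b.subject")
    return (sql, a[1] + b[1])

def run(db, block):
    # Execute the composed strategy as one query, best score first.
    sql, params = block
    return db.execute(sql + " ORDER BY p DESC", params).fetchall()

strategy = combine(filter_by_type("patent"), rank_by_term("engine"))
print(run(make_db(), strategy))  # [('doc1', 0.8), ('doc2', 0.3)]
```

The point of the sketch is that a strategy is just a composition of small blocks, and the composition as a whole compiles down to a single database query.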

We 'just' needed to make it better - much better ;)

The first "real thing"

The no-GUI, no-real-database PoC served its purpose well and gave us the right motivation: now we knew we were on to something with great potential.

The next prototype was aimed at being 'the real thing'. It featured:

  • a data indexing pipeline for XML data into a triple-based database schema
  • a graphical interface to draw Search Strategies on a canvas, by visually dropping building blocks and connecting them
  • a building block library comprising filtering and ranking operations that are essential when building search algorithms of all shapes and sizes
  • an execution engine that could translate the drawn search strategy into probabilistic database queries
  • integrated result inspector and faceted result exploration
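As a rough illustration of the first item in this list, the sketch below turns a small XML file into (subject, property, object) triples. It is a simplified assumption of what such an indexing step could look like, not the actual pipeline: the sample XML, the element names and the `xml_to_triples` function are all made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; real patent XML is far richer.
SAMPLE = """<patents>
  <patent id="p1"><title>Rotary engine</title><lang>en</lang></patent>
  <patent id="p2"><title>Moteur rotatif</title><lang>fr</lang></patent>
</patents>"""

def xml_to_triples(xml_text):
    """Flatten each <patent> element into (subject, property, object) triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for doc in root.iter("patent"):
        subject = doc.get("id")
        triples.append((subject, "type", "patent"))
        # Every child element with text becomes one property of the document.
        for child in doc:
            if child.text and child.text.strip():
                triples.append((subject, child.tag, child.text.strip()))
    return triples

for triple in xml_to_triples(SAMPLE):
    print(triple)
```

Once the data is in this shape, every document, whatever its original structure, can be queried through the same three columns.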

The initial research challenge - patent retrieval - was ready to be tackled.

This was a tremendously inspiring period that we remember with great pleasure. On the one hand, there was the excitement of working together on such an interesting software engineering challenge. On the other hand, there was the chance to learn a great deal from the case of patent search.

Of course we needed some impartial confirmation of the quality of our work. We participated two years in a row, in 2010 and 2011, in the PatOlympics, an international interactive evaluation platform organised by the IRF (Information Retrieval Facility, Vienna) where prototypes of state-of-the-art patent retrieval systems were tested and driven to their limits by actual patent retrieval professionals. We won the gold medal both years, with a score up to 5 times that of the runner-up. This was an important validation not only of the software, but of the approach as a whole: the limited effectiveness of standard ranking algorithms (compared with those of the contenders) turns into superior final ranking strategies when search professionals can take the ranking strategy into their own hands during the search. Overall, patent search with Spinque led to many more relevant items being found than with any competing system, all of which were inflexible with respect to ranking.

Spinque was born

Did we dare take our scientific achievements to the commercial world? Did we really believe in what we had created? It didn't take long to answer those questions. This was something that deserved a chance.

We decided to establish a company: 5F.

'Hold on - did you say 5F?'

Not-so-well-known fact #1: the first name that we had in mind for our company came from the phonetic emphasis on the letter F in the driving principles of our work so far:

  • Flexibility
  • Effectiveness
  • Efficiency

5 'F's in total.

Finding the right balance among those driving principles is a sort of holy grail of software engineering. We liked this name because it was meant to be a continuous reminder of those principles.

We are still attached to this first name (Arjen still uses the mail folder 5f to archive email conversations involving Spinque!), which we used only unofficially, for a few months, and we still strive to adhere to the three founding principles. But, as you know, we eventually called our company Spinque.

The first focus shift - from software to client

Not-so-well-known fact #2: 'Spinque' is the contraction of 'Spin your own query'.

The name popped up during one of the many brain-storming sessions and it immediately resonated with us all. We all liked the catchy sound of 'spin' and the presence of something database-related (query).

However, as we would very soon come to realise ourselves, the essence of Spinque lies really in the most hidden part of the name: 'your own'.

Spinque was really born when we explicitly acknowledged that initial intuition behind the words 'your own'. Making good software according to the 5F principles was only part of the story, and, in a way, a byproduct of our core mission:

'to help our clients leverage their unique expertise to create the best search solutions for them'

Design - Implement - Evaluate - Learn; Repeat

The next few years were, again, so intense! A lot of hard work, so much satisfaction, and so much to learn from each of our clients and their projects.

We expanded our team with talented researchers and professionals, explored many new technical solutions, and offered professional services with higher and higher standards.

The software architecture was improved several times, from the initial monolithic approach, to a solid 3-tier one, to the cloud-ready solution of more recent years.

The building block library was extended with many powerful blocks for state-of-the-art information retrieval, and graphical as well as semantic improvements were introduced to maximise usability.

The representation of our probabilistic knowledge graph was extended to make it more compatible with the widely adopted semantic web standards. This triggered a virtuous circle with many exciting projects, especially in the domains of cultural heritage and public administration, that could benefit more easily from our unique approach and feed back invaluable input for further development. CultuurLINK, a free online vocabulary alignment service for cultural service institutions, is one example of the many outcomes of this virtuous circle.

Efficiency was also greatly and continuously improved over the years. Today, Spinque can serve a high number of concurrent tailored search requests over several billion triples in a matter of milliseconds. This was achieved thanks to careful optimisations and thorough analysis of workloads, but also thanks to the unparalleled performance delivered by the columnar database system MonetDB, in which we have always invested a considerable part of our R&D efforts.

Working in close collaboration with our clients and partners has played a crucial role in honouring the initial 5F principles as well as the higher mission of assisting them with the development of tailored search solutions.

The second focus shift - from client to community

What is the point in learning good lessons, if they don't change us for the better? Years of experience in working with organisations to implement tailored search solutions have taught us much. But more importantly, they have uncovered common patterns.

What questions did our clients ask? Which questions did we ask them? What were the first steps? Which were the most common setbacks? Which were the most common solutions?

For the past year we have been busy bringing all these experiences and lessons together in our design approach 'Search Design'. We had planned to present our take on 'Search Design' to you on the occasion of our 10-year anniversary. Given the limitations of online meetings, we decided to stick to an online toast for now. We hope to present our vision live in Utrecht at a later point in time, as soon as the pandemic allows. We look forward to the next ten years, in which we will continue to learn from each other how to create the best search experiences, together.