Workflow & architecture

Design, Publish, Search!

 

Strictly speaking, Spinque is not a search engine, but rather a search engine generator.

With other tools, data is fed into the system and only the readily available search options can be used. If the search process you had in mind cannot be expressed with such systems, customising them takes considerable programming effort, if it is possible at all.

With Spinque, there is no pre-defined search engine. You design your own search engine, following the logical process that seems most reasonable to you. The result of this design (which happens via a graphical interface) is what we call a strategy. Strategies can be saved, changed, shared, optimised and ultimately published as stand-alone search engines. The following drawings summarise how to work with the Spinque platform and how this relates to its software architecture.
 

Spinque Workflow

Spinque platform - Workflow

 

 

The typical workflow with the Spinque platform is to create an index from the data, design the search strategies that access the data, and publish them as search engines on the Web. Modifying a search strategy also changes the way users search your data.

The steps in the workflow relate to the architecture components as follows:

  • An index for Data is produced using an indexer and fed into a relational database (RDBMS)
  • Search strategies can be designed using the Strategy Editor
  • Search strategies can be published as search engines using the Spinque Management Console
  • Published search engines are available to users via fully customisable search UIs
  • Every communication among search UI, database, Strategy Editor and Spinque Management Console happens via the (automatically generated and therefore "dynamic") REST API

 

Spinque Architecture

Spinque  platform - Architecture

 

Below, we briefly describe each of the architecture components. Upcoming posts will provide more in-depth information.

 

Data

Spinque can work with data coming in a variety of formats. XML is natively supported in Spinque. Many other formats are supported and converted into XML under the hood when necessary, including HTML, SQL, JSON, CSV files, PDF and office documents, and RDF documents.

 

Data containers

Spinque's flexibility in handling data is enriched by data containers, which define how data is accessed. These include filesystem files and directories, data archives such as TAR and ZIP, SVN repositories, RSS feeds and the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Containers can be grouped and chained together to allow extra flexibility. For example, documents from a filesystem can be combined with RSS feeds into a single data input source for the system.
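The idea of chaining containers can be sketched in a few lines of Python. This is only an illustration of the concept, not Spinque's implementation: the function names (directory_container, feed_container, combined) are hypothetical, and the feed is replaced by a list of already-fetched entries.

```python
import tempfile
from itertools import chain
from pathlib import Path


def directory_container(root, pattern="*.xml"):
    """Yield (name, text) for every matching file below root."""
    for path in sorted(Path(root).rglob(pattern)):
        yield path.name, path.read_text(encoding="utf-8")


def feed_container(entries):
    """Stand-in for a feed-like container: yields pre-fetched (id, text) pairs."""
    yield from entries


def combined(*containers):
    """Chain several containers into one data input source."""
    return chain(*containers)


with tempfile.TemporaryDirectory() as tmp:
    # One document on disk, one coming from the (simulated) feed.
    Path(tmp, "a.xml").write_text("<doc>from disk</doc>", encoding="utf-8")
    source = combined(
        directory_container(tmp),
        feed_container([("feed-1", "<doc>from feed</doc>")]),
    )
    # The containers are lazy, so read them while the directory still exists.
    documents = list(source)

print(documents)
```

Because each container only has to yield documents, a consumer such as an indexer does not need to know where any single document came from.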

 

Data templates

Every data collection comes in different "sizes & shapes", not only in terms of data formats but also in terms of internal structure. Data templates instruct the system about where to find relevant pieces of information within the data at hand. While it is possible to use data-agnostic, zero-configuration data templates for simple search operations like full-text search, more focused operations require some detailed knowledge of the data being searched.

As an example, consider a collection of Wikipedia pages. If we planned to support queries for the navigation of the "See also" section, we would need to tell the system how the database entry (Linux, see also, List of operating systems) can be derived from the analysis of the Wikipedia page about the Linux operating system, that is, where to find such links within the document structure.

Such focused data description is made easy by Spinque-customised XSL stylesheets, which contain definitions such as:

<xsl:template match="xpath_to_seealso_section/item">
  ..
  <spinque:relation subject="current_page" predicate="see_also" object="related_page"/>
  ..
</xsl:template>

This is all the system needs to know for making the "See also" section a searchable predicate in the database (see the section about the database below).
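
To make the derivation of such a database entry concrete, here is a minimal Python sketch of the same idea outside of XSL. It is not how Spinque processes templates: the page markup is a made-up simplification (real Wikipedia XML looks different) and the function name see_also_triples is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified page markup used only for illustration.
PAGE = """
<page title="Linux">
  <section name="See also">
    <item>List of operating systems</item>
    <item>Comparison of operating systems</item>
  </section>
  <section name="History">
    <item>Not a see-also link</item>
  </section>
</page>
"""


def see_also_triples(xml_text):
    """Return (subject, predicate, object) triples for the "See also" items."""
    page = ET.fromstring(xml_text)
    subject = page.get("title")
    # Select only the items inside the "See also" section.
    items = page.findall("./section[@name='See also']/item")
    return [(subject, "see_also", item.text.strip()) for item in items]


triples = see_also_triples(PAGE)
for triple in triples:
    print(triple)
```

Each returned tuple has the same shape as the (Linux, see also, List of operating systems) entry described above.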
 

Indexer

The indexer is a piece of software that takes data containers and data templates as input and populates a relational database with an index to make data searchable. It is designed for high data throughput, although the achievable throughput depends on the complexity of the data templates used.

 

RDBMS

Spinque relies on a Relational DataBase Management System as a central repository for storing and querying all the meta-data and indices that are necessary for its operations. Relational database technology provides solid ground for a robust and efficient data management layer. Spinque can talk to any RDBMS, but is optimised to work best with MonetDB/SQL, a high-performance, column-store database engine that excels in heavy analytical tasks on large data.

Later posts will talk more specifically about the nuts and bolts of how data is stored and crunched inside the database. We can however highlight the following interesting aspects that make the Spinque platform unique:

  • Spinque uses an RDBMS for classical field-based selections/aggregations, etc (no surprise here!)
  • Spinque uses an RDBMS to navigate an RDF-like (subject, predicate, object) network of facts
  • Spinque uses an RDBMS to implement full-text search based on inverted file-like index structures
  • Spinque uses an RDBMS to implement hierarchical search primitives typical of XQuery systems
  • Spinque uses an RDBMS to support probabilistic reasoning over all indexed data
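
As a rough illustration of the second point, the sketch below stores (subject, predicate, object) facts in a single relational table and navigates them with plain SQL. It makes several assumptions: it uses SQLite from the Python standard library rather than MonetDB, the table layout and the helper names (related, two_hops) are hypothetical, and the facts are toy data. Spinque's actual schema is not described in this post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        # Toy facts, made up for this example.
        ("Linux", "see_also", "List of operating systems"),
        ("Linux", "see_also", "Unix"),
        ("Unix", "see_also", "BSD"),
        ("List of operating systems", "see_also", "Comparison of operating systems"),
    ],
)


def related(conn, subject, predicate):
    """One hop: all objects linked to subject through predicate."""
    rows = conn.execute(
        "SELECT object FROM triples "
        "WHERE subject = ? AND predicate = ? ORDER BY object",
        (subject, predicate),
    )
    return [row[0] for row in rows]


def two_hops(conn, subject, predicate):
    """Two hops: follow predicate twice by joining the table with itself."""
    rows = conn.execute(
        "SELECT DISTINCT t2.object FROM triples AS t1 "
        "JOIN triples AS t2 ON t2.subject = t1.object "
        "WHERE t1.subject = ? AND t1.predicate = ? AND t2.predicate = ? "
        "ORDER BY t2.object",
        (subject, predicate, predicate),
    )
    return [row[0] for row in rows]


print(related(conn, "Linux", "see_also"))
print(two_hops(conn, "Linux", "see_also"))
```

Navigating one more step in the network of facts costs one more self-join, which is the kind of operation relational engines are built to optimise.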

 

SpinQL

SpinQL (pronounced 'spinkle') is a domain-specific language designed to bridge the gap between the expressive power of Spinque's Search by Strategy concept and the unfriendly efficiency of SQL queries to the RDBMS. It is used internally to describe in full detail the strategies that can be designed visually using the Strategy Editor. While it is not necessary to even know of SpinQL's existence in standard scenarios, skilled users can optionally use it to create more customised search primitives.

 

REST API

All Spinque components communicate via a Representational State Transfer (REST) API available through a Web service. REST commands are issued over HTTP as simple URIs. The simplicity and usefulness of this layer can be best illustrated with an example:

http://rest.spinque.com/app1/q/strategy1/p/price/500/p/buildYear/1970/results

This URI queries the app1 project using strategy1, with query fields price=500 and buildYear=1970. Any custom-built search UI can issue queries to a Spinque server and receive answers, as long as it is able to construct such URIs.
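
A search UI could assemble such a URI with a small helper like the one sketched below. The path layout (project, "q", strategy, repeated "p"/name/value segments, "results") is inferred from the single example above and may not cover the full API. The function name build_query_uri is hypothetical, and percent-encoding each segment is an assumption added for safety.

```python
from urllib.parse import quote


def build_query_uri(base, project, strategy, params):
    """Build a query URI following the pattern of the example above."""
    segments = [project, "q", strategy]
    for name, value in params.items():
        # Every query field becomes a /p/<name>/<value> triple of segments.
        segments += ["p", name, str(value)]
    segments.append("results")
    # Assumption: path segments are percent-encoded individually.
    path = "/".join(quote(segment, safe="") for segment in segments)
    return base.rstrip("/") + "/" + path


uri = build_query_uri(
    "http://rest.spinque.com",
    "app1",
    "strategy1",
    {"price": 500, "buildYear": 1970},
)
print(uri)
# http://rest.spinque.com/app1/q/strategy1/p/price/500/p/buildYear/1970/results
```

The resulting string can be passed to any HTTP client; the response format is not covered in this post.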

 

Strategy Editor

The Strategy Editor is a Web service that lets you design search strategies (remember: Don't program search engines, design them!). Simply drag the available building blocks into the design space and connect them to define how data flows from one operation to the next. Define which parameters will be customisable by end-users. Gain full control of the search process by inspecting the intermediate results at any block in the designed strategy.

When your strategy reflects the search experience you want to offer on your data, save it and publish it as an independent end-user search engine. How? By using the Spinque Management Console.
 

 

 

Spinque Management Console

The Spinque Management Console is a Web service that implements a dashboard from which you can control workspaces, databases, strategies and more. You can connect one or more databases to the workspace that represents your search application, and select which search strategies you want to publish as stand-alone search engines for end-users.
 

 

 

Search UI

As soon as a search strategy is published, the search engine and underlying REST API are generated, and ready to be queried. Thanks to the REST interface, the UI design is completely independent from the Spinque platform and can be designed by any third party. This screenshot shows a simple UI designed by Spinque (we are no UI experts!) for a demo search application related to Dutch political data. Upcoming posts will show how to build a simple UI for Spinque in just a few minutes.

 

Demo UI

 

 
