Quarks


Introduction

With the start of a new decade, hardware keeps modernising (flash storage instead of magnetic disks, for example), and the concepts of databases and web services need to be modernised as well.

So, what's the problem with traditional systems and databases?

A relational database represents data very differently from the client side, which is usually object oriented. NoSQL stores such as MongoDB and CouchDB were introduced to overcome this. However, even these were developed with magnetic disks in mind and were not tuned for newer storage hardware such as flash disks. One store optimized for new-era hardware is RocksDB, which had the benefit of being developed later. Similarly, we need new concepts that make both server-side and client-side development easy as far as data management and services are concerned.

In comes the concept of Quarks (the name was originally given by one of my esteemed colleagues, Dr. Russel Ahmed Apu). Quarks aims to provide a uniform structure for addressing architectural problems, a proposed step in the right direction for modern software development. As the micro-services concept gains popularity, there needs to be a change of mindset and attitude towards server-side programming. Having a separate heavyweight standalone DB will not be a feasible solution in the upcoming decades; hence the Quarks concepts can be considered a paradigm-shifting approach to how services are written.

Philosophy

Quarks serves as a small, lightweight, easily distributable service which eliminates the need to write a lot of APIs on the server side. In simple scenarios there is also no need to create data models on the server and link them to a separate standalone DB. At the heart of Quarks is the concept of simplicity. Programming in the modern era should be simple - the programmer shouldn't have to worry about traffic management, threads, scaling and distribution of the system when the need arises.

For scaling - simply replicate the Quarks servers and put a standard load balancer (like nginx) in front to distribute API calls. The Quarks services will communicate with each other and exchange data if needed.

It is named Quarks because the Quarks services act like small particles (read: lightweight micro-services) that eventually make up a large system! Getting it up and running is as easy as dropping an executable on a hosting server and running it.

A majority of modern apps nowadays have to deal with data, and Quarks provides a mechanism to cache, store, retrieve and operate on this data fast (taking advantage of modern hardware) with clever querying techniques.

It is probably better to explain the usage of Quarks with a real life example.

Usage

Before moving on to the example, here is the essence in two lines:

  1. User -> [Quarks.Store] -> ThreadManagement -> [Cache/RAM] -> Queue -> PersistentStorage
  2. User -> [Quarks.Query] -> ThreadManagement -> fetch from [Cache/RAM] -> return
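
The two flows above can be pictured as a write-behind cache: a store call lands in RAM and is queued for persistence, while a query is answered from RAM. Below is a minimal sketch under stated assumptions: a Map stands in for the cache, a plain array stands in for persistent storage, and thread management is left out. The names WriteBehindStore, put, get and flush are hypothetical and are not the Quarks API. The batch size and the immediate option mirror the persistence feature listed later in this document.

```javascript
// Illustrative only: the names below are hypothetical, not the Quarks API.
class WriteBehindStore {
  constructor(persist, batchSize = 3) {
    this.cache = new Map();      // stands in for [Cache/RAM]
    this.queue = [];             // writes waiting to reach persistent storage
    this.persist = persist;      // callback that receives a batch of [key, value] pairs
    this.batchSize = batchSize;  // assumed threshold that triggers a batch write
  }

  // Flow 1: store -> cache -> queue -> persistent storage
  put(key, value, { immediate = false } = {}) {
    this.cache.set(key, value);
    this.queue.push([key, value]);
    if (immediate || this.queue.length >= this.batchSize) this.flush();
  }

  // Flow 2: query -> fetch from cache -> return
  get(key) {
    return this.cache.get(key);
  }

  // Hand everything queued so far to persistent storage as one batch.
  flush() {
    if (this.queue.length === 0) return;
    this.persist(this.queue.splice(0));
  }
}

const disk = []; // pretend persistent storage: one entry per batch written
const store = new WriteBehindStore((batch) => disk.push(batch));
store.put("item1", { price: 10 });
store.put("item2", { price: 20 });
console.log(store.get("item1"), disk.length);
```

Reads never touch the pretend disk here, and writes reach it only when the queue fills up or the caller asks for an immediate dump.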

We will discuss how scaling happens for huge data in a bit, after we go through a use case scenario.

User Story - A public chatting system

Step 1 : User types a name and ASL (age, sex, location),
chooses from a list of chat channels and clicks join.
The call goes to a nodejs/php server through an API or socket.
The server generates a user_id and assigns the user to a channel.

Step 2 : User sends some messages to a specific channel

Step 3 : On re-entry user can see all messages for that channel.

We address these steps as follows:
(Using diagrams to illustrate the solution)

Step - 1

Step - 2

Step - 3

As an example of the clever querying mentioned earlier, notice how a wildcard search makes it possible to retrieve the desired info from loads of data.
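
To make the wildcard idea concrete, here is a small sketch of matching keys against a pattern such as "item*". The key layout used for the chat messages ("<channel>_<userId>_<timestamp>") and the helper names wildcardToRegExp and findByKeys are assumptions made for illustration. The linear scan is also only for illustration and says nothing about how Quarks performs the lookup internally.

```javascript
// Hypothetical key layout for the chat example: "<channel>_<userId>_<timestamp>".

// Turn a wildcard pattern such as "item*" into an anchored regular expression.
function wildcardToRegExp(pattern) {
  // Escape every regex metacharacter except "*", which becomes ".*" below.
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Return the [key, value] pairs whose key matches the wildcard pattern.
function findByKeys(store, pattern) {
  const re = wildcardToRegExp(pattern);
  return [...store].filter(([key]) => re.test(key));
}

const messages = new Map([
  ["general_u1_1001", { text: "hello" }],
  ["general_u2_1002", { text: "hi there" }],
  ["sports_u1_1003", { text: "anyone watching?" }],
]);

const generalOnly = findByKeys(messages, "general_*"); // every message in one channel
const fromU1 = findByKeys(messages, "*_u1_*");         // every message from one user
console.log(generalOnly.length, fromU1.length);
```

With a key layout like this, "all messages for a channel" in Step 3 becomes a single wildcard lookup instead of a hand-written API.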

Needless to say, all the saving, server hits, request queuing, traffic handling and thread management are now part of the Quarks system, which takes that headache away.

Now on to scaling and having a distributed system.

As the traffic grows we can create new Quarks instances (let's call them cores), and these can be controlled by a multicore/multi-instance manager. Illustration below:

Distributed Quarks

Basically, the idea is to have a lot of lightweight, easily droppable servers (calling them Quarks Cores) and an adequate balancing server (planning to name it Boson). Once the Boson is running, the cores will be able to talk to each other and fetch a lot of results very fast!

So, what does fetchMessages look like for a multi-core system? Not too different from what you saw in Step-3. fetchMessages invokes a find query on Quarks.

going multicore

Again, an illustration of how Quarks works on the query internally:

First, the find request is passed on to the multicore manager, which publishes the request for all of the listening cores to process:

fetchMessages multicore

Finally, as results become available, the manager aggregates them and returns them to the requesting core, which was invoked by the client.

Return results

Point to note - in client-side code we will hardly notice the difference.
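
The publish-and-aggregate flow described above can be sketched as follows. This is a single-process illustration under stated assumptions: the Core and Manager classes, the findLocal method and the key layout are hypothetical, and plain function calls stand in for the network messaging that a real deployment would need.

```javascript
// Illustrative only: the classes below are hypothetical stand-ins, not Quarks code.
class Core {
  constructor(name, data) {
    this.name = name;
    this.data = new Map(Object.entries(data)); // this core's share of the keys
  }

  // Answer a find request using only the data held by this core.
  findLocal(prefix) {
    return [...this.data]
      .filter(([key]) => key.startsWith(prefix))
      .map(([key, value]) => ({ key, value, core: this.name }));
  }
}

class Manager {
  constructor(cores) {
    this.cores = cores;
  }

  // Publish the request to every listening core, then merge the partial results.
  find(prefix) {
    return this.cores
      .flatMap((core) => core.findLocal(prefix))
      .sort((a, b) => a.key.localeCompare(b.key));
  }
}

const manager = new Manager([
  new Core("core-1", { general_1001: "hello", sports_1003: "goal!" }),
  new Core("core-2", { general_1002: "hi there" }),
]);

// The client-facing call keeps the same shape as in the single-core case.
function fetchMessages(channel) {
  return manager.find(channel + "_");
}

console.log(fetchMessages("general").map((m) => m.value));
```

The caller of fetchMessages never learns which core held which message, which is why client-side code stays the same.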

Some key features provided by Quarks:

1) Persistence of data - Quarks will provide a mechanism where, after a certain amount of memory/cache is filled, it sends the data in batches for serialisation/dumping to the database for later retrieval. That reduces the hits to persistent storage. However, it will also provide a way to dump to persistent storage immediately if required.

2) Querying on data - Quarks will be able to query on the value (as well as the key) through ORM-style queries and apply business logic to the queried data.

Sample query format for "querying items which are up for sale with keys like item* (i.e. item1, item2 etc.), then finding the sellers of such items (each item has a seller_id field that contains the user_id of the seller)":

{
    "keys":"item*",
    "include":{
        "map": {"field":"seller_id", "as":"seller"},
        "module":"main",
        "filter":"jsFilter",
        "params":"{\"approved\":1}"
    }
}

Here module main refers to the main.js file residing on the server in the same path as the executable, and filter gives the name of the JS function which we will use to further filter the data. The idea is that the script main.js will have a filter function with the predefined form filter(elem, params), or a sort function with the predefined form sort(elem1, elem2, params), to further filter/sort the data. 'elem' is an individual item (one of many) found by the Quarks lookup through "keys":"item*". The JS module and the function are invoked from C++ while finding and iterating over the matching items. It is up to the user to interpret the params on the server side and write the script code accordingly. In our example, we named the function "jsFilter" in main.js. Quarks will allow only minimal use of scripting to ensure the server-side code remains super optimized.

Scripting languages like JavaScript will be allowed for applying business logic to the queried data. We will be able to use plugins too for computation and calculation.

3) Sorting - Sorting will be done based on the query and, if necessary, by applying server-side logic through scripts/plugins.

4) Expiry - An efficient data expiry mechanism will be provided on the server side.
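
The filter(elem, params) and sort(elem1, elem2, params) forms described under features 2 and 3 could look roughly like this in main.js. This is a sketch under stated assumptions: it assumes elem arrives as an already parsed object and params arrives as the JSON string given in the query. The function name jsSort and the {"by": ...} parameter shape are hypothetical. The array pipeline at the bottom only stands in for the C++ side iterating over the matched items.

```javascript
// Sketch of what main.js might contain; the data shapes are assumptions.

// filter(elem, params): keep only elements whose fields match every entry in params.
// Assumes elem is a parsed object and params is the JSON string from the query.
function jsFilter(elem, params) {
  const wanted = JSON.parse(params);
  return Object.keys(wanted).every((field) => elem[field] === wanted[field]);
}

// sort(elem1, elem2, params): comparator; params names the numeric field to order by.
// "jsSort" and the {"by": ...} shape are hypothetical.
function jsSort(elem1, elem2, params) {
  const { by } = JSON.parse(params);
  return elem1[by] - elem2[by];
}

// Stand-in for the C++ side iterating over the items matched by "keys":"item*".
const items = [
  { id: "item1", seller_id: "user7", approved: 1, price: 30 },
  { id: "item2", seller_id: "user9", approved: 0, price: 10 },
  { id: "item3", seller_id: "user7", approved: 1, price: 20 },
];

const result = items
  .filter((elem) => jsFilter(elem, "{\"approved\":1}"))
  .sort((a, b) => jsSort(a, b, "{\"by\":\"price\"}"));

console.log(result.map((elem) => elem.id));
```

Keeping each hook to a small pure function fits the stated goal of minimal scripting: the heavy lifting (key lookup, iteration) stays in native code and the script only decides per element.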

Lastly, Quarks can reside in the same memory space as the business logic by means of simple scripts or plugins (dynamically loaded libraries), without the need for a separate server for storing data. Lightweight servers querying data in the same memory space will make result retrieval much faster than in traditional systems.

Also notice the difference in approach:
Traditional Framework =>
client -> api gateway -> backend server logic -> makes query to a separate DB server (latency occurs) -> fetches data -> works on the fetched data -> produces results -> sends back to client
Quarks Framework =>
client -> api gateway -> Quarks data lookup (applies business logic during lookup) -> sends data back to client
...reducing round trips to separate servers and eliminating a few steps in between.

Quarks is up and running!

Try the browser based Quarks Editor:
Editor

Technology

a) C++ Crow Webserver for serving client requests
b) Good JSON Parsing C++ Library (not decided yet, probably will go with the one provided in Crow)
c) RocksDB and
d) V8 JavaScript engine

ZeroMQ is to be integrated for internal communication between Quarks Cores.
The code repo can be found here: https://github.com/lucpattyn/quarks

Just to reiterate, Quarks is a system as well as a philosophy and a set of concepts and guidelines (e.g. longer reads, short bursts of writes .. more on that in due time) to make modern-day programming easy and simple.

Updates: 

1. 2019-05-06 - We have now got the first draft of Quarks uploaded to git (https://github.com/lucpattyn/quarks)

2. 2019-07-02 - RocksDB integrated

3. 2019-08-22 - Quarks now supports filter queries in the form of
{ "keys":"item*", "filter":{"map": {"as":"seller", "field":"seller_id"}} }
in plain English:
query items which are up for sale with keys like item* (i.e. item1, item2 etc.), then find the sellers of such items (each item has a seller_id field that contains the user_id of the seller)

4. 2019-10-20 - Scripting integration through the V8 engine

5. 2019-11-01 - Quarks Console is up and running for testing (http://api.quarkshub.com/console)

However, this should not be used anymore.

2021-07-17 - The console will be replaced by the Editor at the same URL

Slowly but surely we are getting there :)


Signing off,

Mukit