June 16, 2024

Structured Query Language for the web

Do you wish you could use a single data-query endpoint from your SPA?
Wouldn't it be great to build your data query step by step?
Would you like a SQL, but for Web?

In the previous blog we made the case for using relational sql database. We also discussed differences between vendors and deployments.
And we’ve looked at common code for working with SQL.
One big piece was missing and this piece might be a reason why NOSQL databases are popular - how to query/read/select data.

This blog post will introduce a json based structure query language, json-fetch.

Motivation

Why and when do we need it?

Api explosion

All apps that show the user a list of items will eventually need sorting and filtering. Even a simple product list app needs complex filtering: by size, description, availability, category, date. Then consider adding user comments, ratings, similar products, vendors. If we mix and match the filtering options, we need dozens maybe hundreds of queries.
And with new products, we will be adding more all the time.

Can we build a backend system without parametrized queries - yes, but it will be a massive pain to maintain. And with a lot of duplicated code, leading to bugs and security issues.

The moral of this story is: even conceptually simple apps, often need hundreds of data-queries.
Building a server API for each query is unrealistic and non-maintainable.

Instead, dynamic queries will be needed.

We argue, that instead of multiple apis, with each having a different ways for filtering, ordering and linking data, a single query api endpoint with a well defined query language should be used.

This creates a single place to audit, to secure, to log, etc.

Ultimately, this leads to less code, fewer bugs and easier maintenance.

Json structure

We want a query language, where the query can be manipulated with code easily. To add an attribute to the result set, we push an object to the attributes array, to add a condition, we push an object to the conditions array.
The JSON notation is easy to write by hand, and the queries look natural in a JS file. Moreover, with Typescript we also get a level of type checking.
And we can write simple methods to manipulate and compose queries, for example toggling various user filters.

Finally, administrators, system customizers or advanced users can build their own queries with visual tools.

Low overhead

The query is easy to translate to SQL, because it uses the same concepts. It is a simplified SQL dialect written as JSON. What we want is a single translateToSql call that will validate the query and generate the parametrized sql SELECT statement. The results of the SQL query can be directly piped to the api caller.

Security

Because the query is simplified SQL we can actually audit and validate the query and all the parameters. The json structure makes it easy to add additional data filtering according to security rules at the api endpoint.

Alternatives

What are the available dynamic (client side) query languages?

GraphQL - widely used and rather easy to write and understand. Requires additional libraries to manipulate in code. Used by Shopify, Gatsby and many others. The project is open source.
Odata (2,4) - a rather simple rest-based api. The problem is that the really nice features, like advanced filters, json output and querying related entities are not widely supported, or are vendor dependant. Many tools are stuck with OData 2.0 which is XML based and a far cry from OData 4.0
SOQL - a safer SQL by Salesforce, vendor specific, but has some good ideas.
FetchXml - Microsoft Dynamics query language. It is vendor specific and xml, but the structure is well thought out. And it is the spiritual ancestor of Json Fetch.

Json Fetch

Let's look at Json Fetch in more detail. We will start with declarations.

export interface Fetch {
	entity: FetchEntity;
	page?: number;
	pageSize?: number;
	count?: number;
}
export interface FetchEntity {
	name: string;
	attributes?: FetchAttribute[];
	filter?: FetchFilter;
	links?: FetchLink[];
	orders?: FetchOrder[];
	allattrs?: boolean;
}
export interface FetchAttribute {
	attribute: string;
	entityName?: string;
	alias?: string;
	aggregate?: "count" | "countdistinct" | "min" | "max" | "sum" | "average";
	groupBy?: "column" | "year" | "quarter" | "month" | "week" | "day";
	// quarter, month, week -> will group by year first implicitly.
}
export interface FetchOrder {
	attribute: string;
	alias?: string;
	descending?: boolean;
}
export class FetchFilter {
	type?: 'and' | 'or';
	conditions?: FetchCondition[];
	filters?: FetchFilter[];
}
export interface FetchCondition {
	attribute: string;
	entityName?: string;
	operator: FetchConditionOperator;
	value?: string;
	values?: string[];
	valueLabel?: string;
}
export type FetchConditionOperator =
	"eq" |
	"ne" |
	"gt" |
	"lt" |
	"ge" |
	"le" |
	"on" |
	"null" |
	"not-null" |
	"like" |
	"not-like";

export interface FetchLink extends FetchEntity {
	from: string;
	to: string;
	type?: "inner" | "outer";
	alias: string;
}

Let's unpack it.

Fetch

The fetch json is the top query object. There are a few query wide configuration options:

page and pageSize: for paging resulting records. If page is 4 and page size is 50, the results will start at 200 and 50 results max
Count: maximum number of records (when paging is not used)

Entity

Then there is the Entity object that defines the primary database table. The SELECT… FROM entity. It contains properties for the common SQL query part:

select attribute list
where - the filter object
order - the orders list
links - a list of joined entities

Select list contains simple objects, where only the attribute is required. The rest of the properties are used in advanced scenarios.

Filter

The filter object is a little peculiar, it has a list of conditions that must all be true - and type or at-least one is true - or type. If a combination is needed the filter has a list of sub-filters. This might take a while to get used to, but it makes the common cases easy to express.

Condition

The condition object is has 3 main and one optional properties. It represents a simple table-column (operator) value expression. The attribute - database table column that the filter applies to. The entityName - option alias of a linked entity. This is especially useful for expressing filters where the linked entity (joined table) column is null. Operator - one of the operators that map straight to sql or add a little logic: null, not-null, eq, ne, lt,gt,le,ge Value - the actual text, number or date the column should be compared to. Numbers are serialized with JS.toString, Date values as toISOString().

Order

Order is an array of objects with the attribute name and whether to sort ascending or descending. If ordering by a linked entity (joined table) is needed the alias property has to be set.

FetchLink

FetchLink object extends the Entity object and adds: The type of link (whether an left outer or inner join will be used) The alias of the link - defines the prefix for returned attributes and can be used in the entityName of filter conditions from, to. The join conditions: from is an attribute if the parent entity, to is an attribute of the linked entity.

Putting it together:

let q = {
	entity: {
		name: "customers",
		attributes: [{ attribute: "name" }],
		filter: {
			conditions: [{ attribute: "name", operator: "like", value: "inu %" }]
		}
	},
	orders: [{ attribute: "name" }],
	links: [{
		name: "contact", from: "primarycontactid", to: "id", type: "outer", alias: "k",
		attributes: [{ attribute: "name" }, { attribute: "emailaddress" }, { attribute: "phonenumber" }],
	}
}

Compared to the SQL the JSON is not shorter.

SELECT c.name, k.name, k.emailaddress, k.phonenumber
FROM customers AS c
WHERE c.name LIKE "inu %"
LEFT JOIN contact AS k ON (c.primarycontactid = k.id)

However, it should be obvious that the json is much easier to manipulate.
Also, since it is rather small SQL subset, it doesn’t allow less used or straight-up esoteric sql features.
These can be added as needed for example as custom condition operators.
What's more, since we can inspect the query at the server api endpoint we can refuse some conditions for some users (roles) or add other arbitrary rules.

The power of having full control over the query with standard JSON simply cannot be overstated.

In our apps, the base query is defined in app configuration, but after that several client side components further augment the query.

Search text box, might add a search filter,
Grid sort headers might override the sort order,
Quick-filters can optionally add further filters or links with user input, such as category, size etc.

And for admin and advanced users, we also provide a visual query builder tool. Once the query arrives at the backend, additional filters and links are added based on the calling user’s permission set.

Finally, having json-fetch makes our queries portable. We can augment the sql generation to take advantage of a particular sql engine features. In our apps, the translate layer will output either MSSQL or SQLite dialect.

Wrap

We hope we conviced you that dynamic query can provide massive benefits.
Compared to the "general wisdom" of using non-parametrized queries, they can be much safer and easier to maintain - if done well.

In the next blog we shall look at the validation and translation methods.

Until then, if you have any questions, suggestions or corrections, please let us know!

Happy hacking!