Working with Virtually Related Data
by Terence Bennett • July 6, 2018
It is hard to believe it has been two years since we blogged on working with related data in the DreamFactory REST API. While we have continued to significantly enhance the product since then, we wanted to tell you about a new feature that builds on top of this previous blog. Since its inception, the DreamFactory REST API’s SQL DB Service has had the ability to combine a single table’s result set with related tables’ result sets in a single API call, both for queries and for data management i.e., CRUD operations. DreamFactory not only learns the existing relationships i.e., based on foreign keys, from a database connection, but also allows manipulation of the relationships in a database schema via the API. If any of this is news to you or not familiar, please refer to our earlier blog.
But what if you have data sets that are related (or used in an application as such) but those relationships are not detailed in the schema of the database, or the data exist on different databases or even different servers? Currently, the solution is to use some server-side scripting, or make multiple calls from the client-side and pull the information together as needed.
Virtual Relationships, aka DataMesh
Introducing what we call DataMesh, a new feature built into the SQL DB Service that allows extending the relationship concepts already available, but to "virtual relationships". A virtual relationship is one whose foreign key constraint isn't discovered from the database schema but is provisioned via the API. These virtual foreign keys are provisioned in the table schema and stored locally as part of the system tables and applied to the discovered schema and used by table record operations for data retrieval and provisioning. So now admins can use the API to provision virtual relationships from tables to tables on the same service (i.e. the same database connection) or to any other database service provisioned on the DreamFactory platform. These relationships are then exposed in the table records API and used just like any other existing relationship.
An Inter-Service Example
Say I have a legacy SQL database that contains a table for tracking contacts (let's call it contact). It has the typical user data including email address, name, etc. Over the years we have added other apps that track e-commerce customers, whose schema and data are potentially imported into the legacy database, one piece of which is a "username" field (required to be an email address) in a separate table (let's call it customer). The schema for this database doesn't reflect any relationships between this data (due to various possible reasons...DBA neglect, developer lack of initiative, did I mention "legacy"?, etc.). You are tasked with presenting any additional information you can from your environment about a particular contact, including any customer information or possible purchase history.
Using DataMesh, you would create a virtual relationship between the contact data you already access and the new data that you want to include by taking the following steps...
- provision a service for the legacy database (i.e. name is "mysql"),
- choose your primary table to access a majority of the known data, in this case the "contact" table,
select the "customer" table via the Schema tab on the DreamFactory Admin application,
- select the "username" field, and open for edit,
- select Foreign Key, then Virtual Foreign Key checkboxes which enable the Reference Table and Field selection menus,
select "contact" for the table and "email" for the field,
- and save (Update) the configuration.
This will automatically create a belongs_to relationship from the "customer" to the "contact", as well as, a has_many relationship in the opposite direction.
To see how these different relationships are represented or provisioned via the API, take a look at the following JSON snippets...
As a baseline, here is a field with a foreign key discovered or provisioned as a real database constraint, in this case our "contact" table points to a business "account" table:
{
"name": "account_id",
"type": "reference",
"is_virtual_foreign_key": false,
"is_foreign_service": false,
"ref_service": null,
"ref_table": "account",
"ref_fields": "id"
}
Here is the "customer" username field with a virtual foreign key pointing to the same service, "contact" table:
{
"name": "username",
"type": "string",
"is_virtual_foreign_key": true,
"is_foreign_service": false,
"ref_service": null,
"ref_table": "contact",
"ref_fields": "email"
}
The relationship from the perspective of the "contact" table:
{
"name": "customer_by_username",
"type": "has_many",
"field": "email",
"is_virtual": true,
"is_foreign_service": false,
"ref_service": null,
"ref_table": "customer",
"ref_fields": "username"
}
The relationship from the perspective of the "customer" table:
{
"name": "contact_by_username",
"type": "belongs_to",
"field": "username",
"is_virtual": true,
"is_foreign_service": false,
"ref_service": null,
"ref_table": "contact",
"ref_fields": "email"
}
A Service-to-Service Example
Taking the same example above, except that the "customer" table exists on a separate database, possibly on a separate server all together. The only real differences in provisioning are as follows...
- provision an additional service for the database containing the customer table(s) (i.e. name is "ecommerce"),
- when performing step 5, you would also select "Foreign Reference Service" which would allow you to select the additional service provisioned,
- take note of the "dot" notation in the relationship name designating that the relationship pointing to a different service.
And here are the JSON snippets for this scenario...
Here is the "customer" username field with a virtual foreign key pointing to the "contact" table on the "legacy" service:
{
"name": "username",
"type": "string",
"is_foreign_key": true,
"is_virtual_foreign_key": true,
"is_foreign_service": true,
"ref_service": "legacy",
"ref_table": "contact",
"ref_fields": "email"
}
The relationship from the perspective of the "contact" table:
{
"name": "ecommerce.customer_by_username",
"type": "has_many",
"field": "email",
"is_virtual": true,
"is_foreign_service": true,
"ref_service": "ecommerce",
"ref_table": "customer",
"ref_fields": "username"
}
The relationship from the perspective of the "customer" table:
{
"name": "mysql.contact_by_username",
"type": "belongs_to",
"field": "username",
"is_virtual": true,
"is_foreign_service": true,
"ref_service": "mysql",
"ref_table": "contact",
"ref_fields": "email"
}
Note that the names of the relationships are auto-generated, but can be aliased to something more desirable for use in the API.
Using the Virtual Relationships
Any of the created relationships can be used in the "related" URL query parameter, just like schema-defined relationships. In our demo data, the contact "Jane Smith" was found in the customer table with the same email.
https://demo/api/v2/mysql/_table/contact?related=customer_by_username
{
"resource": [
{
"id": 1,
"first_name": "John",
"last_name": "Smith",
"display_name": "John Smith",
"email": "johnsmith@demo.com",
"customer_by_username": []
},
{
"id": 2,
"first_name": "Jane",
"last_name": "Smith",
"display_name": "Jane Smith",
"email": "janesmith@demo.com",
"customer_by_username": [
{
"id": 2,
"display_name": "Jane Elise Smith",
"username": "janesmith@demo.com"
}
]
}
]
}
In Conclusion
There are some very powerful capabilities in DreamFactory 2.0 for combining multiple data sources. The new DataMesh feature takes our product one step further in getting all the data your application needs in a easy, clean, and succinct manner. That plus all of the other great features of DreamFactory should get you well on your way to happy app development!
Terence Bennett, CEO of DreamFactory, has a wealth of experience in government IT systems and Google Cloud. His impressive background includes being a former U.S. Navy Intelligence Officer and a former member of Google's Red Team. Prior to becoming CEO, he served as COO at DreamFactory Software.