If you want to spin up a fast API solution, DreamFactory is a great way to do that with a Bitnami install. Within minutes you can have a fully documented and secure REST API to utilize. Just like any program bundle, there are lots of features to learn and interact with. Outside of a Docker Swarm or AWS ELB setup, it is pretty hard to find a way to spin up a DreamFactory instance faster. We are going to dive in a bit further to find out how to interact with the system database.
NOTE: If you change the schema or a setting inside of this database, the system may become unstable and lose some or all functionality. This is only recommended for advanced users.
One feature that is hidden within the Bitnami install is the ability to see the system database that is managing your DreamFactory instance. In the image below, the MySQL DreamFactory instance system database is shown.Inside of a Bitnami instance, things are different. If you select the MySQL database that ships with the instance you will not see a schema. Fear not, there is an easy solution!Understanding that there would be no way for you to log into the instance without a system database, let alone be able to see anything in the Schema tab, means it must be somewhere. But where? The system database is not picked up by default as a service in Bitnami. If you are interacting with the DreamFactory admin interface in any way at all, you have a properly configured system database.
Where is the System Database, and How Do I Get There?
To interact with the system database in Bitnami, you just need to take a look at your .env file. This file is located inside of the htdocs folder, which is inside of the install path (this can be seen on the config tab). The system database credentials will be in the Database Settings section. Bitnami names the system database by default something similar to bitnami_dreamfactory or bn_dreamfactory.You will notice that you can access the system database endpoints regardless of not being able to change the schema in the Schema tab. If you migrate to one of the system endpoints in the API docs tab, the info will still be returned as normal. Below is an example of the GET /system/environment endpoint in my local Bitnami instance:Under the Services tab, you can create a new MySQL database service and use the credentials from your .env file to add the system database to your API Docs and Schema tabs, so that you can interact with them. With the discovery of the system database available you can now see full Swagger documentation. Furthermore, you can interact with the endpoints as normal. For example, you can use the endpoint GET /_schema to see the schema as shown below.From there you can run amok in the system database. Again, one last warning, please realize that if you are not careful, you might CRUD your way into a non-functioning system. The bright side to that as we said up top…you can just spin up another and that my friends, is the beauty of Bitnami.
Your manager’s peers have been bragging a lot lately about their data warehouses, analytics, and charts, and now a steady stream of data-related questions are being sent your way. Your department maintains several databases, and the data they contain has the potential to answer everything management is asking for. But the databases are needed for day-to-day operations, and can’t scale to answer these often highly specific questions such as, “How many asparaguses were consumed by men named Fonzie in Cleveland on Tuesdays in 2013?”. How to unlock the potential of this data?
You’ve probably heard of data warehouses, which are tailor-made for this sort of witchcraft. They make it possible to unlock every bit of value from data, and find answers wickedly fast. In the past, creating and maintaining data warehouses meant large, ongoing investments in hardware, software, and people to run them. This would be a hard sell – isn’t the company already spending enough?! Good news, however! In this day of cloud computing, it’s incredibly simple to create, load, and query data warehouses. They typically charge on a usage basis, meaning you don’t need the initial upfront capital investment to get off the ground. And they are super fast – far more powerful than anything you could run in-house.
This post will focus on Amazon Web Services Redshift (Amazon Web Services = AWS). And as a bonus, I’ll demonstrate the incredible Dreamfactory, which automatically builds a slick REST API interface over the top. From there, you’re a GUI away from giving management everything they could ask for, and wowing them with extras they hadn’t even thought of. They can now stand tall amongst their fellow executives, knowing you have their back.
AWS Redshift is built upon PostgreSQL, but has been dramatically enhanced to run at “cloud scale” within AWS. There are a few ingredients to this secret sauce:
While you don’t need a deep understanding of what’s happening under the hood to use it, Redshift employs a fascinating approach to achieve it’s mind-boggling performance.
Let’s say you have data that looks like the following:
ID NAME CREATED DESCRIPTION AMOUNT
1 Harold 2018/01/01 Membership 10.00
2 Susan 2017/11/15 Penalty 5.00
3 Thomas 2016/10/01 Membership 8.00
Most SQL databases you’ve probably used in the past are row-based, which means they store their data something like this:
This is the efficient way to maximize storage, and works well for retrieving data in the “traditional fashion” (rows at a time). But when you want to slice and dice this data, it doesn’t scale very well. If you’ve got large (business-scale) volumes of data, and a variety of ways you want to query it, you can really start to strain your database.
Column-based databases, on the other hand, flip this idea on its head, and store the information in a column-based format, with the *data* serving as the *key*. So the above might look something like this:
This drastically improves query performance. For example, when searching for “DESCRIPTION == ‘Membership'”, the query only needs to make one database call (“give me the items with a ‘DESCRIPTION’ of ‘Membership'”), instead of inspecting each row individually (as it would have to do in a traditional, row-based database). Very cool, very fast!
When I picture what the AWS cloud must look like, I usually conjure something up from the Matrix (except it’s full of regular computers, rather than, well, humans). Or maybe Star Trek’s “Borg”, a ridiculous planet-cube flying through space, sucking up other civilizations. I guess both of those images are a little disturbing. A safer mental image is this – data centers spanning the globe, loaded with racks and racks of computers, all connected and working together.
For most computing tasks, throwing more hardware at the problem doesn’t automatically increase performance. There are bottlenecks that remain in place no matter how many processors are churning away. In our “traditional database” example, this bottleneck is typically disk I/O – the processors are all trying to grab data from the same place. To overcome this, the architecture and storage have to be arranged in a way that can benefit from parallelization.
Which is exactly the case with AWS Redshift. Due to the column-based design described above, Redshift is able to take full advantage of adding processors, and it’s almost linearly scalable. This means if you double the number of computers (“nodes”, in Redshift-speak), the performance doubles. And so on. Combine this scalability with the ridiculous number of computers AWS has at it’s disposal (specifically, several Borgs-worth), and it’s like staring out at a starry night. It goes on forever in all directions.
How this works for you
If you’re sold on the power of AWS Redshift, then you’ll be pleased to learn that setup is incredibly simple. AWS documentation is top notch, a crucial thing in this brave new world. When writing this post, I followed their tutorial, and it all went smoothly. Probably took me 15 minutes, and I had the example up and running.
If you already have SQL expertise, you won’t have any problem picking up Redshift syntax. There are some differences and nuances, but the standard “things” (joins, where clauses, etc) all work as expected. I typically use Microsoft’s SQL Server Management Studio (SSMS), and was able to connect to Redshift with no problem (after setting it up as a linked server). Your favorite SQL client will presumably work here as well (anything that supports JDBC or ODBC drivers).
Once you get your feet wet, there are myriad tools that will load your business data into Redshift. If you’ve got SQL chops in house, I’d start with the AWS documentation, and go from there. If you need a little (or a lot) of help, a whole ecosystem of companies and tools have sprung up around Redshift. A quick Google search will introduce you to them.
When you’re up and running, and growing more comfortable demanding more from the system, AWS makes it incredibly simple to add capacity. Thanks to the brilliant Redshift architecture, you just add nodes, and AWS takes care of the rest. Their billing dashboard will show you what it’s costing in real time, with no hidden or creeping costs of data centers, hardware upgrades, things going bump in the night, etc. So much magic happening under the covers, and you get the credit. The joys of cloud computing!
My Humble Example
When writing this, I used the example AWS provides (it consists of a few tables containing some fake Sales data). With everything in place, I can query from SSMS (with a little bit of “linked server” glue syntax):
exec ('-- Find total sales on a given calendar date.
FROM sales, date
WHERE sales.dateid = date.dateid
AND caldate = ''2008-01-05'';') at redshift
(1 row affected)
I get a thrill when a chain of systems, architectures, and networks all flow together nicely. Somewhere in a behemoth of a data center, a processor heard my cry, and spun out this result in response. Amazing.
Now that the company has access to the data, and can gleefully ask any question, they are going to want the dashboards and pretty graphs. Typically you’d use a REST API to feed the data to some sort of UI, but how to do this with Redshift? While management is tickled with their new toy, they will cloud over with suspicion if you now propose a months-long project to make it shinier.
In keeping with the theme of “easy, automatic, and powerful”, I’d propose using DreamFactory. In a matter of minutes (literally), it will connect to a data store (both SQL or NoSQL), intelligently parse all the schema, and spin up a REST API layer for doing all the things (complete with attractive documentation). What used to take a team of developers months can now happen in an afternoon!
Here are some screenshots of my REST API, completely auto generated from the Redshift example above. It took me about 15 minutes (12 of those spent poking around the documentation) to get this done. For my simple example, I followed their Docker instructions, and in no time was playing with the REST API depicted below:
To Infinity and Beyond!
Now that you’ve witnessed how easily you can warehouse all your data, and bootstrap it into a REST API, it’s time to bring this to your organization. Play with it a little, get comfortable with the tools, then turn up the dials. Want to learn more about how DreamFactory and Redshift can work together (or how to put a REST API on any database)? Schedule a demo with us. The next time management comes calling for data, you can give it to them with a fire hose!
Consider a query which joins employee records found in an employees table with information about their assigned department, the latter of which resides in a table named departments. The relationship is formalized using a key named emp_no. When DreamFactory parses the schema it will create aliases for each relationship, including one for the above-described named something like dept_emp_by_emp_no. The join query will therefore look like this:
Note: We have updated the instructions here to match our DF-Docker repo instructions. This will pull the latest GitHub repo now.
DreamFactory can be run as a Docker container, which makes it easier than ever to get the backend for your apps up and running. The DreamFactory Docker image is available on Docker Hub, or you can build your own image from the GitHub repo. Using these two methods, I’ll show you how to use Docker to fire up your own DreamFactory instance in just a few steps. This setup uses MySQL for the system database and Redis for the system cache. The basic idea is that you first start the containers for MySQL and Redis, then a container for DreamFactory which links to the others.
So what happens if you make a mistake and expose your admin app api_key or just need to change api_key associated with one of your apps? We have an easy workaround that doesn’t require you to have to change any of your endpoints or having to recreate an app, etc. This article shows you how to access all of your app API keys via MySQL or, if you haven’t fully started exploring DreamFactory yet, the default SQLite database.
One of the most important ideas in the world of software engineering is the concept of loose coupling. In a loosely coupled design, components are independent, and changes in one will not affect the operation of others. This approach offers optimal flexibility and reusability when components are added, replaced, or modified. Conversely, a tightly coupled design means that components tend to be interdependent. Changes in a single component can have a system wide impact, with unanticipated and undesirable effects.
DreamFactory 2.10 is live! The 2.10 release adds support for the OpenAPI 3.0 spec, role-based access control for the DreamFactory Admin App, and a number of bug fixes. Head on over to Bitnamior GitHub to install DreamFactory 2.10. Here’s what’s new: