Integrating Third-Party Libraries With DreamFactory Scripting
Being able to script business logic into your endpoints to further extend and customize the data response(s) and request(s) is one of the best features inside of DreamFactory. As a form of ETL data integration, DreamFactory can help you create a data lake or data warehouse, especially when combined with a service such as Xplenty. Sometimes cases arise where you need to take advantage of the robust third-party library community to help extend your API. In the example below, we are going to use a third-party library to help us connect our DreamFactory instance to a Twitter Account to post updates. We also bring in the conecpt of connecting the Twitter service to our Rapsberry Pi IoT device to post temperature information to show you how can connect multiple services through DreamFactory to manage your data needs in a simple, robust method that makes API mangement a snap.
Tokyo, midday on a Friday. My phone kept buzzing during my customer meeting. The day was a workday, a day of travel and the day a personal property purchase was due to settle. Little did I know it was also the day I was the target of a “port-out” scam.
In the DreamFactory Tokyo office in Azabu Juban, Minato, in the few minutes I had between one meeting and the next, I glanced at my phone trying to keep pace with solicitor updates regarding my property transaction finalizing. Amidst the deluge of notifications my phone was providing, one thing I quickly dismissed were two verification SMS codes sent by my telco provider that I hadn’t asked for and didn’t give a second thought to:
It’s hard to believe the year’s end is almost upon us! This has been a pretty transformative year for the company. We’ve seen record demand for the DreamFactory Platform, and have additionally been working around the clock on a number of new initiatives:
If you want to spin up a fast API solution, DreamFactory is a great way to do that with a Bitnami install. Within minutes you can have a fully documented and secure REST API to utilize. Just like any program bundle, there are lots of features to learn and interact with. Outside of a Docker Swarm or AWS ELB setup, it is pretty hard to find a way to spin up a DreamFactory instance faster. We are going to dive in a bit further to find out how to interact with the system database.Continue reading “Learning About The Bitnami System Database”
Your manager’s peers have been bragging a lot lately about their data warehouses, analytics, and charts, and now a steady stream of data-related questions are being sent your way. Your department maintains several databases, and the data they contain has the potential to answer everything management is asking for. But the databases are needed for day-to-day operations, and can’t scale to answer these often highly specific questions such as, “How many asparaguses were consumed by men named Fonzie in Cleveland on Tuesdays in 2013?”. How to unlock the potential of this data?
You’ve probably heard of data warehouses, which are tailor-made for this sort of witchcraft. They make it possible to unlock every bit of value from data, and find answers wickedly fast. In the past, creating and maintaining data warehouses meant large, ongoing investments in hardware, software, and people to run them. This would be a hard sell – isn’t the company already spending enough?! Good news, however! In this day of cloud computing, it’s incredibly simple to create, load, and query data warehouses. They typically charge on a usage basis, meaning you don’t need the initial upfront capital investment to get off the ground. And they are super fast – far more powerful than anything you could run in-house.
This post will focus on Amazon Web Services Redshift (Amazon Web Services = AWS). And as a bonus, I’ll demonstrate the incredible Dreamfactory, which automatically builds a slick REST API interface over the top. From there, you’re a GUI away from giving management everything they could ask for, and wowing them with extras they hadn’t even thought of. They can now stand tall amongst their fellow executives, knowing you have their back.
AWS Redshift is built upon PostgreSQL, but has been dramatically enhanced to run at “cloud scale” within AWS. There are a few ingredients to this secret sauce:
While you don’t need a deep understanding of what’s happening under the hood to use it, Redshift employs a fascinating approach to achieve it’s mind-boggling performance.
Let’s say you have data that looks like the following:
ID NAME CREATED DESCRIPTION AMOUNT
1 Harold 2018/01/01 Membership 10.00
2 Susan 2017/11/15 Penalty 5.00
3 Thomas 2016/10/01 Membership 8.00
Most SQL databases you’ve probably used in the past are row-based, which means they store their data something like this:
This is the efficient way to maximize storage, and works well for retrieving data in the “traditional fashion” (rows at a time). But when you want to slice and dice this data, it doesn’t scale very well. If you’ve got large (business-scale) volumes of data, and a variety of ways you want to query it, you can really start to strain your database.
Column-based databases, on the other hand, flip this idea on its head, and store the information in a column-based format, with the *data* serving as the *key*. So the above might look something like this:
This drastically improves query performance. For example, when searching for “DESCRIPTION == ‘Membership'”, the query only needs to make one database call (“give me the items with a ‘DESCRIPTION’ of ‘Membership'”), instead of inspecting each row individually (as it would have to do in a traditional, row-based database). Very cool, very fast!
When I picture what the AWS cloud must look like, I usually conjure something up from the Matrix (except it’s full of regular computers, rather than, well, humans). Or maybe Star Trek’s “Borg”, a ridiculous planet-cube flying through space, sucking up other civilizations. I guess both of those images are a little disturbing. A safer mental image is this – data centers spanning the globe, loaded with racks and racks of computers, all connected and working together.
For most computing tasks, throwing more hardware at the problem doesn’t automatically increase performance. There are bottlenecks that remain in place no matter how many processors are churning away. In our “traditional database” example, this bottleneck is typically disk I/O – the processors are all trying to grab data from the same place. To overcome this, the architecture and storage have to be arranged in a way that can benefit from parallelization.
Which is exactly the case with AWS Redshift. Due to the column-based design described above, Redshift is able to take full advantage of adding processors, and it’s almost linearly scalable. This means if you double the number of computers (“nodes”, in Redshift-speak), the performance doubles. And so on. Combine this scalability with the ridiculous number of computers AWS has at it’s disposal (specifically, several Borgs-worth), and it’s like staring out at a starry night. It goes on forever in all directions.
How this works for you
If you’re sold on the power of AWS Redshift, then you’ll be pleased to learn that setup is incredibly simple. AWS documentation is top notch, a crucial thing in this brave new world. When writing this post, I followed their tutorial, and it all went smoothly. Probably took me 15 minutes, and I had the example up and running.
If you already have SQL expertise, you won’t have any problem picking up Redshift syntax. There are some differences and nuances, but the standard “things” (joins, where clauses, etc) all work as expected. I typically use Microsoft’s SQL Server Management Studio (SSMS), and was able to connect to Redshift with no problem (after setting it up as a linked server). Your favorite SQL client will presumably work here as well (anything that supports JDBC or ODBC drivers).
Once you get your feet wet, there are myriad tools that will load your business data into Redshift. If you’ve got SQL chops in house, I’d start with the AWS documentation, and go from there. If you need a little (or a lot) of help, a whole ecosystem of companies and tools have sprung up around Redshift. A quick Google search will introduce you to them.
When you’re up and running, and growing more comfortable demanding more from the system, AWS makes it incredibly simple to add capacity. Thanks to the brilliant Redshift architecture, you just add nodes, and AWS takes care of the rest. Their billing dashboard will show you what it’s costing in real time, with no hidden or creeping costs of data centers, hardware upgrades, things going bump in the night, etc. So much magic happening under the covers, and you get the credit. The joys of cloud computing!
My Humble Example
When writing this, I used the example AWS provides (it consists of a few tables containing some fake Sales data). With everything in place, I can query from SSMS (with a little bit of “linked server” glue syntax):
exec ('-- Find total sales on a given calendar date.
FROM sales, date
WHERE sales.dateid = date.dateid
AND caldate = ''2008-01-05'';') at redshift
(1 row affected)
I get a thrill when a chain of systems, architectures, and networks all flow together nicely. Somewhere in a behemoth of a data center, a processor heard my cry, and spun out this result in response. Amazing.
Now that the company has access to the data, and can gleefully ask any question, they are going to want the dashboards and pretty graphs. Typically you’d use a REST API to feed the data to some sort of UI, but how to do this with Redshift? While management is tickled with their new toy, they will cloud over with suspicion if you now propose a months-long project to make it shinier.
In keeping with the theme of “easy, automatic, and powerful”, I’d propose using DreamFactory. In a matter of minutes (literally), it will connect to a data store (both SQL or NoSQL), intelligently parse all the schema, and spin up a REST API layer for doing all the things (complete with attractive documentation). What used to take a team of developers months can now happen in an afternoon!
Here are some screenshots of my REST API, completely auto generated from the Redshift example above. It took me about 15 minutes (12 of those spent poking around the documentation) to get this done. For my simple example, I followed their Docker instructions, and in no time was playing with the REST API depicted below:
To Infinity and Beyond!
Now that you’ve witnessed how easily you can warehouse all your data, and bootstrap it into a REST API, it’s time to bring this to your organization. Play with it a little, get comfortable with the tools, then turn up the dials. Want to learn more about how DreamFactory and Redshift can work together (or how to put a REST API on any database)? Schedule a demo with us. The next time management comes calling for data, you can give it to them with a fire hose!
Consider a query which joins employee records found in an employees table with information about their assigned department, the latter of which resides in a table named departments. The relationship is formalized using a key named emp_no. When DreamFactory parses the schema it will create aliases for each relationship, including one for the above-described named something like dept_emp_by_emp_no. The join query will therefore look like this:
Note: We have updated the instructions here to match our DF-Docker repo instructions. This will pull the latest GitHub repo now.
DreamFactory can be run as a Docker container, which makes it easier than ever to get the backend for your apps up and running. The DreamFactory Docker image is available on Docker Hub, or you can build your own image from the GitHub repo. Using these two methods, I’ll show you how to use Docker to fire up your own DreamFactory instance in just a few steps. This setup uses MySQL for the system database and Redis for the system cache. The basic idea is that you first start the containers for MySQL and Redis, then a container for DreamFactory which links to the others.
So what happens if you make a mistake and expose your admin app api_key or just need to change api_key associated with one of your apps? We have an easy workaround that doesn’t require you to have to change any of your endpoints or having to recreate an app, etc. This article shows you how to access all of your app API keys via MySQL or, if you haven’t fully started exploring DreamFactory yet, the default SQLite database.
One of the most important ideas in the world of software engineering is the concept of loose coupling. In a loosely coupled design, components are independent, and changes in one will not affect the operation of others. This approach offers optimal flexibility and reusability when components are added, replaced, or modified. Conversely, a tightly coupled design means that components tend to be interdependent. Changes in a single component can have a system wide impact, with unanticipated and undesirable effects.