KNOTS is a quick and intuitive visual ETL tool that lets you perform complex data replication with ease. Using its visual interface, you can bring your data into data.world with the power of Singer taps and targets. Taps and targets are applications that can be combined to create simple data pipelines: a tap extracts data from a source, and a target consumes that data.
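To make the tap/target idea concrete, here is a minimal sketch of the message flow between the two. It follows the general shape of Singer messages (JSON objects of type SCHEMA, RECORD, and STATE, one per line), but `toy_tap`, `toy_target`, the `users` stream, and the `last_id` bookmark are hypothetical names invented for illustration; they are not the taps or targets that KNOTS ships.

```python
import io
import json


def toy_tap(rows, out):
    """Write Singer-style SCHEMA, RECORD and STATE messages, one JSON object per line."""
    schema = {
        "type": "SCHEMA",
        "stream": "users",
        "schema": {
            "type": "object",
            "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
        },
        "key_properties": ["id"],
    }
    out.write(json.dumps(schema) + "\n")
    for row in rows:
        out.write(json.dumps({"type": "RECORD", "stream": "users", "record": row}) + "\n")
    # STATE carries a bookmark so that a later run can resume where this one stopped.
    last_id = rows[-1]["id"] if rows else None
    out.write(json.dumps({"type": "STATE", "value": {"last_id": last_id}}) + "\n")


def toy_target(lines):
    """Consume Singer-style messages; return the loaded records and the last state seen."""
    loaded, state = [], None
    for line in lines:
        message = json.loads(line)
        if message["type"] == "RECORD":
            loaded.append(message["record"])
        elif message["type"] == "STATE":
            state = message["value"]
    return loaded, state


# Connect the two through an in-memory buffer instead of a shell pipe.
buffer = io.StringIO()
toy_tap([{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}], buffer)
records, state = toy_target(buffer.getvalue().splitlines())
print(records)
print(state)
```

Real taps and targets are separate programs joined by a pipe, with the tap writing to standard output and the target reading from standard input; the in-memory buffer here only stands in for that pipe.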
Using KNOTS, you can import data from a number of datastores on an ad-hoc basis or you can export a fully-configured knot and run it with a job scheduler to make sure your data is always up-to-date. With the intuitive interface, you can configure robust data replication processes in minutes, and KNOTS is always free.
With KNOTS and data.world, you can:
- Leverage ETL without complex manual configuration
- Quickly bring data from various datastores into data.world
- Pause and resume replication without any loss of progress
KNOTS is currently available only on macOS. A Windows version is in the works; if you'd like to be notified when it's complete, please create a new ticket.
Download and install the latest version. On macOS, use the DMG installer.
KNOTS currently supports the following data sources:
- Amazon S3
Please let us know if there’s another source that you would love to see added.
KNOTS depends on Docker being installed and running. Docker is a tool designed to make it easier to create, deploy, and run containers. Containers allow us to package up an application or library with all of its dependencies. Each individual tap and target is packaged into its own container with the correct set of dependencies to ensure they’re all easy to use.
The installer for Docker for Mac is available here.
NOTE: Check Docker file sharing preferences and make sure that "/Users" is a shared directory.
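If you want to confirm that Docker is up before launching KNOTS, a small check along these lines may help. It is a sketch that assumes the `docker` command-line client is on your PATH and that `docker info` exits with a non-zero status when the daemon cannot be reached; `docker_is_running` is a hypothetical helper name, and the script does not inspect the file sharing preference mentioned above.

```python
import subprocess


def docker_is_running(command=("docker", "info")):
    """Return True if the command exits with status 0, False otherwise."""
    try:
        result = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The binary is missing, not executable, or did not answer in time.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    if docker_is_running():
        print("Docker is running")
    else:
        print("Docker is not installed or not running")
```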
Running the app
- From the home screen, click on "Get Started", or on "New knot" in the upper right-hand corner
- Select a tap to use from the list of available taps
- Provide the configuration values required by the tap
- Click on "Continue" to run the tap in Discovery mode and determine which tables/streams are available
- Select the tables/streams that you would like to sync
- Select the data.world target from the list. To configure it, select the dataset/project that will be used to contain your data, and include your API token.
- Enter a name for the new knot, and click on "Save & Run" to execute it
Once the process has finished, click on "Done" to return to the home screen. You should now see a list of your saved knots and the various actions that can be taken on them.
- Sync new data: Sync from the point of last run
- Sync all data: Sync from the beginning
- Edit: Modify the configurations for the tap and the target
- Export: Download a ZIP file with the tap and target, and their configurations
- Delete: Remove the knot
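The difference between "Sync new data" and "Sync all data" can be pictured as a bookmark that records how far the last run got. The sketch below is only an illustration of that idea: the `sync` function, the `id` field, and the `last_id` bookmark are hypothetical, and the source does not describe how KNOTS stores its progress internally.

```python
def sync(rows, state=None, full=False):
    """Return (rows to replicate, new state).

    `state` is a bookmark dict such as {"last_id": 2}; `full=True` ignores it.
    """
    bookmark = None if full or not state else state.get("last_id")
    selected = [r for r in rows if bookmark is None or r["id"] > bookmark]
    # Keep the old bookmark when nothing new was found.
    new_bookmark = max((r["id"] for r in selected), default=bookmark)
    return selected, {"last_id": new_bookmark}


source = [{"id": 1}, {"id": 2}]
first_run, state = sync(source)                   # first run: everything is new
source.append({"id": 3})
new_only, state = sync(source, state)             # "Sync new data": only id 3
everything, state = sync(source, state, full=True)  # "Sync all data": start over
print(len(first_run), len(new_only), len(everything))
```

Because the bookmark is kept between runs, an interrupted or paused replication can pick up from the last recorded position instead of starting again.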
Run with a scheduler
Within the app, KNOTS currently updates your data only when you manually click the Sync new data action. With the Export action and the resources it provides, however, you can set up a job that updates the data automatically on a schedule.
The exported package is a ZIP file that includes the tap and target for your knot, their configurations, and a Makefile. You can read more about Makefiles here, but the gist is that the Makefile contains all the commands needed to run the knot and keep your data updated. With a tool like crontab, you can run the knot on a schedule from your local computer, or you can set it up to run on a cloud service such as Amazon Web Services or Heroku.
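As a rough sketch of the crontab approach, the helper below builds a crontab entry that changes into the unzipped export directory and runs `make` there. The directory path, the log file name, and especially the `sync` target name are assumptions made for illustration; open the exported Makefile to find the targets it actually defines.

```python
import shlex


def cron_line(knot_dir, schedule="0 * * * *", make_target="sync", log_name="knot.log"):
    """Build a crontab entry that runs `make <make_target>` inside knot_dir.

    make_target is a placeholder; check the exported Makefile for the real target names.
    """
    directory = shlex.quote(knot_dir)
    command = (
        f"cd {directory} && make {shlex.quote(make_target)} "
        f">> {shlex.quote(log_name)} 2>&1"
    )
    return f"{schedule} {command}"


# Hypothetical location of an unzipped export; runs at the top of every hour.
print(cron_line("/Users/me/knots/my-knot"))
```

The printed line can be pasted into the editor opened by `crontab -e`. Keep in mind that cron runs jobs with a minimal environment, so the full paths to `make` and `docker` may need to be spelled out, and Docker must be running when the job fires.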
If you run into issues or have questions, please submit a ticket to the data.world team.