Have data on another site that you'd like to publish to data.world for easy sharing, collaboration, visualization, and querying? No problem! As long as you have a direct URL to the file and permission to access it, data.world can import it easily. This is a great solution for importing data from the web, data portals, cloud storage apps, GitHub, and API endpoints! Even better, if the files change at the source, data.world can automatically update them.
To add data from a URL, click the Add data button on the dataset overview page and select the Sync from URL option:
If the URL does not require authentication to access it, all you need to do to add it to your dataset is enter the shareable link (provided by the data source) into the source URL field and select Continue:
For sources where you need permission to access the data, turn authentication on and choose one of the options in the dialog:
Headers and POST body are used to make API calls to sites that support a REST API. See the support docs on those sites for the required values. Although it is possible to include authentication information in headers, it is better to use the Authentication setting: collaborators on the dataset or project can see everything in the headers (including logins and passwords), but they cannot see Authentication values.
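To make the Headers and POST body fields more concrete, here is a minimal Python sketch of the kind of request they describe. It only builds the request and never sends it. The endpoint URL, the header, and the payload are hypothetical placeholders, and the JSON encoding of the body is an assumption; the source site's support docs are the authority on what it actually expects.

```python
import json
import urllib.request


def build_api_request(url, headers=None, post_body=None):
    """Build (but do not send) an HTTP request with optional headers and body.

    With a POST body the request is a POST; without one it is a plain GET.
    Assumption: the body is sent as JSON, which is common but not universal.
    """
    data = None
    all_headers = dict(headers or {})
    if post_body is not None:
        data = json.dumps(post_body).encode("utf-8")
        all_headers.setdefault("Content-Type", "application/json")
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=all_headers, method=method)


# Hypothetical endpoint and values, for illustration only.
request = build_api_request(
    "https://api.example.com/v1/export",
    headers={"Accept": "text/csv"},
    post_body={"table": "sales", "year": 2020},
)
print(request.get_method(), request.full_url)
```

Note that the sketch carries no credentials in its headers, in line with the advice above to keep logins and passwords in the Authentication setting.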
When you have finished entering the required information for your URL, click Continue and you'll be prompted to name the data file on data.world. If data.world encounters an issue with the source URL, we will display an error asking you to verify the link and the settings. Hovering over the ? next to the error message will show you the exact error returned:
Clicking Edit will bring up the same dialog you used to enter the initial parameters so you can make changes:
Note that an authentication failure is sometimes reported as 404 Not Found instead of 403 Forbidden, even though the URL is correct. This result is a security feature of the API: it avoids revealing that a protected resource exists.
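As a rough troubleshooting aid, the status codes mentioned above can be turned into hints along these lines. This is a sketch, not data.world's own logic: the function name and the hint wording are hypothetical, and only the 404/403 behavior comes from this article.

```python
from http import HTTPStatus


def explain_sync_error(status_code):
    """Return a troubleshooting hint for an HTTP status code from the source."""
    if status_code in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN):
        return "Check the Authentication settings for this source."
    if status_code == HTTPStatus.NOT_FOUND:
        # Some APIs report failed authentication as 404, so check both.
        return "Check the URL; if it is correct, check the Authentication settings."
    if 500 <= status_code < 600:
        return "The source site reported a server error; try again later."
    return "Verify the link and the settings."


print(explain_sync_error(404))
```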
Syncing your data
Data that is added from a URL can be automatically updated to keep it current with its source. The options for autosync are:
- Do not sync
- Sync hourly
- Sync daily
- Sync weekly
The setting you choose applies to every file in your dataset that can be synced. The current autosync setting for your dataset is shown on the overview screen, right under the Add data button. You can change this setting at any time by selecting Autosync:
Files can also be synced manually by selecting Sync now from the source information under the file name:
If a scheduled sync fails for any reason, it will be attempted again at the next scheduled interval.
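The autosync options and the retry rule can be pictured with a short Python sketch. It assumes each option is a fixed interval measured from the last attempt, which this article does not confirm; the names `SYNC_INTERVALS` and `next_sync_attempt` are hypothetical and not part of data.world.

```python
from datetime import datetime, timedelta

# Assumed mapping from the autosync options to fixed intervals.
SYNC_INTERVALS = {
    "Do not sync": None,
    "Sync hourly": timedelta(hours=1),
    "Sync daily": timedelta(days=1),
    "Sync weekly": timedelta(weeks=1),
}


def next_sync_attempt(setting, last_attempt):
    """Return when the next scheduled sync would run, or None if syncing is off.

    A failed sync is not retried early: whether the last attempt succeeded
    or failed, the next one happens at the next scheduled interval.
    """
    interval = SYNC_INTERVALS[setting]
    if interval is None:
        return None
    return last_attempt + interval


last_attempt = datetime(2020, 1, 1, 9, 0)
print(next_sync_attempt("Sync daily", last_attempt))
```

To refresh a file sooner than the schedule allows, use Sync now as described above.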