Licensing datasets you own
Why license your dataset?
data.world can help you expand the use of your dataset and increase your visibility. However, if your dataset does not have any license terms, you retain all rights in it, and no one else is authorized to use, copy, distribute, or share it, combine it with other data, change it, or make derivative works from it. The absence of a license greatly reduces the potential and usefulness of your dataset.
As a result, we encourage you to provide a license for your dataset and, in doing so, to pick as open a license as you feel comfortable with to maximize the benefits of your dataset. We believe that the more open a license is, the more likely others are to use your dataset and to recognize you for your work and as a proponent of open data. We've put together a list of common license types for datasets.
Common license considerations
Choose an established and current license
By choosing an established license, such as one from our list of common license types, you are choosing a license that is widely adopted and that was drafted by organizations dedicated to making those licenses functional in many situations, as well as interoperable, clear, and understandable. You'll need to read the actual licenses, by clicking on the links we've provided, to make sure you've picked the one that fits your dataset and how you would like others to interact with it.
Consider how you want others to use your dataset
The more open a license you choose, the more others can use, share and distribute your dataset to get to insights faster. Your dataset could be important to solving a pressing issue. We encourage you to maximize your dataset's potential by choosing an open license.
Consider the results of a data project
When a project involves a number of datasets, each with a different license, the licenses may conflict and greatly restrict or even prohibit the resulting work. By choosing the most open license, you amplify your dataset's usefulness. Another tip is to review the licenses of the other datasets that may be involved in a project or used in your industry to determine what type of license would allow your dataset to be used alongside them. Usually, two datasets that both have CC-BY licenses can be combined under those license terms. However, you will still need to pay attention to the versions of those licenses to make sure they work with one another. In addition, licenses that look similar, such as CC-BY and ODC-ODbL, can still conflict with each other, so datasets under them cannot necessarily be combined.
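As a rough illustration of the review step described above, the following Python sketch flags a project whose datasets do not all share the same license and version. It does not decide whether licenses are compatible; it only tells you when you need to read and compare the license texts. The function name, the dataset names, and the tuple layout are hypothetical and chosen for this example only.

```python
def needs_license_review(datasets):
    """Return True when the datasets do not all share one license and version.

    Each entry is a (dataset_name, license_name, license_version) tuple.
    This is only a prompt for manual review, not a compatibility verdict.
    """
    distinct_licenses = {(lic, ver) for _, lic, ver in datasets}
    return len(distinct_licenses) > 1


# Hypothetical project inventory (dataset names are made up).
project = [
    ("rainfall", "CC-BY", "4.0"),
    ("census_extract", "CC-BY", "3.0"),
    ("road_network", "ODC-ODbL", "1.0"),
]

if needs_license_review(project):
    print("Mixed licenses or versions: read each license before combining.")
```

Even when this check reports a single license and version, you still need to comply with that license's conditions.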
We like the current versions of the open Creative Commons licenses because they are widely adopted, apply to databases, and facilitate collaboration. We believe these licenses are becoming more widely accepted for datasets and databases. In addition, Creative Commons has created a tool to help you choose the appropriate license for your dataset.
How to license your dataset on data.world
When creating your dataset, select the applicable license from the drop-down menu. If the license you would like to use is not listed, select "Other" and then, in the summary description box, add the name of the license that applies to all the files in your dataset, along with a link to the full license terms. If you would like files to have different licenses, create a separate dataset for each license type and upload each file to the applicable dataset.
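If you have many files under different licenses, it may help to plan the split before uploading. The Python sketch below groups files by license so that each group can become its own dataset. It is a planning aid only and does not call any data.world API; the function name, file names, and license labels are hypothetical.

```python
from collections import defaultdict


def plan_datasets(file_licenses):
    """Group file names by license so each group can become its own dataset."""
    plan = defaultdict(list)
    for filename, license_name in file_licenses.items():
        plan[license_name].append(filename)
    # Sort the file names so the plan is stable and easy to read.
    return {lic: sorted(files) for lic, files in plan.items()}


# Hypothetical files and the license each one is released under.
files = {
    "stations.csv": "CC-BY 4.0",
    "readings.csv": "CC-BY 4.0",
    "boundaries.geojson": "ODC-ODbL 1.0",
}

plan = plan_datasets(files)
for license_name, filenames in plan.items():
    print(f"Dataset licensed under {license_name}: {', '.join(filenames)}")
```

Each key in the resulting plan corresponds to one dataset to create, with that license selected from the menu or entered under "Other".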
Licensing and datasets you found
I've found an interesting dataset and want to put it on data.world. Can I do that?
You'll need to check the licensing terms on that dataset to see if you are authorized by the owner to distribute, re-post, re-publish or share it. If those terms allow you to do these things, you'll also need to review and comply with the conditions under which you can do so. We've put together a list of common licenses for datasets with links to the license terms here.
If the dataset is available to the public on the Internet, why do I need to check and comply with the terms?
Even if datasets are publicly available, their owners can continue to have rights in those datasets. Those rights extend to how the data is organized, displayed, described, visualized, etc. and can include the effort in compiling the data. These intellectual property rights need to be respected. To do so, make sure that you read and comply with the license terms on the dataset.
What happens if I don't comply with a dataset's license or terms?
Where can I find a dataset's licensing terms and conditions?
Sometimes finding the license terms on a dataset can be difficult. You can look for them:
- On the main webpage
- On the page where the summary or description of the dataset is located
- On the download page of the dataset
- Under "legal" in the footer of the webpage
But I can't find those license terms. Now what?
If, after searching the site where you found the dataset, you can't locate any terms or licenses that cover it, you can reach out to the owner to ask for permission to use the dataset or to ask that a license be added to the dataset on the site. A dataset that does not have any license terms means the owner retains all rights in the dataset and does not authorize anyone else to use, copy, distribute, or share it, combine it with other data, change it, or make derivative works from it.
What about fair use?
Fair use is a tricky area. If you use copyrighted materials in a way that complies with the fair use doctrine, you might not be infringing the copyright. However, courts look at the specific circumstances of the usage, so even if your usage is similar to how others have used copyrighted materials, there is no guarantee that a court will find that you have not violated someone's copyright, since your circumstances may be different.
Section 107 of the U.S. Copyright Act provides the framework for determining whether something is a fair use and identifies certain types of uses (such as criticism, comment, news reporting, teaching, scholarship, and research) as examples of activities that may qualify as fair use. Section 107 calls for consideration of the following four factors in evaluating a question of fair use:
- Purpose and character of the use, including whether the use is of a commercial nature or is for nonprofit educational purposes: Courts look at how the party claiming fair use is using the copyrighted work, and are more likely to find that nonprofit educational and noncommercial uses are fair. This does not mean, however, that all nonprofit educational and noncommercial uses are fair and all commercial uses are not fair; instead, courts will balance the purpose and character of the use against the other factors below. Additionally, "transformative" uses are more likely to be considered fair. Transformative uses are those that add something new, with a further purpose or different character, and do not substitute for the original use of the work.
- Nature of the copyrighted work: This factor analyzes the degree to which the work that was used relates to copyright's purpose of encouraging creative expression. Thus, using a more creative or imaginative work (such as a novel, movie, or song) is less likely to support a claim of a fair use than using a factual work (such as a technical article or news item). In addition, use of an unpublished work is less likely to be considered fair.
- Amount and substantiality of the portion used in relation to the copyrighted work as a whole: Under this factor, courts look at both the quantity and quality of the copyrighted material that was used. If the use includes a large portion of the copyrighted work, fair use is less likely to be found; if the use employs only a small amount of copyrighted material, fair use is more likely. That said, some courts have found use of an entire work to be fair under certain circumstances. And in other contexts, using even a small amount of a copyrighted work was determined not to be fair because the selection was an important part—or the "heart"—of the work.
- Effect of the use upon the potential market for or value of the copyrighted work: Here, courts review whether, and to what extent, the unlicensed use harms the existing or future market for the copyright owner's original work. In assessing this factor, courts consider whether the use is hurting the current market for the original work (for example, by displacing sales of the original) and/or whether the use could cause substantial harm if it were to become widespread.
In addition to the above, other factors may also be considered by a court in weighing a fair use question, depending upon the circumstances. Courts evaluate fair use claims on a case-by-case basis, and the outcome of any given case depends on a fact-specific inquiry. This means that there is no formula to ensure that a predetermined percentage or amount of a work—or specific number of words, lines, pages, copies—may be used without permission.