Best Practices:
- Use the Summary section to tell your data’s story.
- Where did the data come from? Cite and link to your sources or include your details for a 'citation request'. Not only does this give "credit where credit is due," it helps other members evaluate the data's suitability for their needs.
- If you think a particular piece of context will be useful to other members, add it.
- The best Summaries cover the “who, what, where, when, why, and how” of the data.
- What’s the data telling you? What would others be interested to know about it? What have others found using this data?
- If the data has associated data dictionaries or other documentation, upload it and then link to it from your Summary.
- Make it visually friendly with Markdown styling. It’s easy to learn and goes a long way.
- Tag your datasets to improve discoverability. Add multiple tags using your tab key.
- Contribute to existing datasets: use the discussion tab to add comments or to ask the owner to add you as a collaborator.
Helpful Tips:
- Make sure each column in your data files has a single header. data.world uses these headers for previews, insights and queries.
- Remove any headers, footers or notes that fall outside the single row of column headers in the data file. Include the removed content in the dataset summary, or upload it as a separate notes file in the dataset. Keeping the data file basic helps data.world import and analyze the data as accurately as possible.
- Use complete descriptors in dataset titles to help drive better search results. For example, instead of "HOUHouses" use "2015 Houston House Prices".
- Files within a dataset are displayed alphabetically, so if the files in your dataset should be displayed in a particular order, name them accordingly (01_*.xls, 02_*.pdf, etc.) or use the summary to take others through your data and analysis.
- Search first to see if the same dataset has already been uploaded, and if so, consider collaborating or linking directly to that dataset rather than uploading a duplicate. There’s nothing wrong with uploading your own copy, but sharing through collaboration or direct linking will keep that data’s ‘story’ in one place.
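The header and notes clean-up tip above can be sketched in a few lines of Python. This is only an illustration, not a data.world tool: the function name `split_notes_from_table`, its parameters, and the sample data are hypothetical, and it assumes you already know which row holds the column headers and how many footer rows follow the data.

```python
import csv
import io

def split_notes_from_table(raw_text, header_row, footer_rows=0):
    """Split raw CSV text into (clean_csv_text, notes).

    header_row is the 0-based index of the single row of column headers.
    footer_rows is the number of trailing rows that are notes, not data.
    Both values are supplied by the caller; nothing is auto-detected.
    """
    rows = list(csv.reader(io.StringIO(raw_text)))
    end = len(rows) - footer_rows
    table = rows[header_row:end]             # header row plus data rows
    removed = rows[:header_row] + rows[end:]  # everything outside the table

    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(table)

    # Keep the removed text so it can go into the summary or a notes file.
    notes = [", ".join(cell for cell in row if cell) for row in removed if any(row)]
    return out.getvalue(), notes

# Made-up sample file with a title line above the headers and a footer below.
raw = (
    "Houston house prices (preliminary)\n"
    "\n"
    "address,year,price\n"
    "12 Main St,2015,250000\n"
    "9 Oak Ave,2015,310000\n"
    "Source: county records\n"
)
clean, notes = split_notes_from_table(raw, header_row=2, footer_rows=1)
```

Here `clean` holds only the header row and the data rows, ready to upload, while `notes` keeps the removed title and source lines for the dataset summary or a separate notes file.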
Common Gotchas:
- Multiple headers, merged cells and other complex formatting can result in inference issues upon upload into data.world. For best results, remove complex formatting from Excel/Numbers workbooks or upload as a CSV.
- Improperly escaped characters within a CSV file can cause import and inference issues; follow the definition of the CSV format documented here: http://www.ietf.org/rfc/rfc4180.txt#page-1.
- Date columns can cause inference issues upon upload into data.world. If dates are included in your dataset, we recommend converting the file to CSV or TSV before uploading.
- Ensure your dataset is within the data.world size limitations.
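As a rough sketch of the escaping and date points above, Python's standard `csv` module can write fields that contain commas or quotes in the quoting style RFC 4180 describes. The helper name `to_csv_text` and the sample row are hypothetical. Writing dates as ISO 8601 text (`YYYY-MM-DD`) is an assumption made for this example, not a documented data.world requirement.

```python
import csv
import io
from datetime import date

def to_csv_text(header, rows):
    """Serialize a header and rows to CSV text with standard quoting."""
    out = io.StringIO()
    # The default "excel" dialect quotes fields that contain commas, quotes,
    # or line breaks, doubles embedded quotes, and ends rows with CRLF.
    writer = csv.writer(out)
    writer.writerow(header)
    for row in rows:
        # Assumption: dates are written as ISO 8601 text (YYYY-MM-DD).
        writer.writerow(
            [cell.isoformat() if isinstance(cell, date) else cell for cell in row]
        )
    return out.getvalue()

# Made-up row with an embedded comma, embedded quotes, and a date.
text = to_csv_text(
    ["address", "note", "sold_on"],
    [["12 Main St, Unit 4", 'Listed as "cozy"', date(2015, 3, 7)]],
)
```

Reading `text` back with `csv.reader` returns the original field values, which is a quick way to check that a file is escaped consistently before uploading it.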