- Make sure each column in your data files has a single header. These headers are what data.world uses for previews, insights and queries.
- Remove any headers, footers or notes outside of the single row of column headers from the data file. Include the removed content in the dataset summary, or upload it as a separate notes file in the dataset. Keeping the data file basic helps data.world import and analyze the data as accurately as possible.
- Use complete descriptors in dataset titles to help drive better search results. For example, instead of "HOUHouses" use "2015 Houston House Prices".
- Files within a dataset are displayed alphabetically, so if the files in your dataset should be displayed in a particular order, name them accordingly (`01_*.xls`, `02_*.pdf`, etc.) or use the summary to walk others through your data and analysis.
- Search first to see if the same dataset has already been uploaded, and if so, consider collaborating or linking directly to that dataset rather than uploading a duplicate. There’s nothing wrong with uploading your own copy, but sharing through collaboration or direct linking will keep that data’s ‘story’ in one place.
- Multiple headers, merged cells and other complex formatting can result in inference issues upon upload into data.world. For best results, remove complex formatting from Excel/Numbers workbooks or upload as a CSV.
- Improperly escaped characters within a CSV file can cause import and inference issues; follow the definition of the CSV format documented in RFC 4180: http://www.ietf.org/rfc/rfc4180.txt#page-1.
- Date columns can cause inference issues upon upload into data.world. If dates are included in your dataset, we recommend converting the file to CSV or TSV before uploading.
- Ensure your dataset is within the data.world size limitations.
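
As an illustration of the single-header-row and CSV-escaping tips above, here is a minimal sketch using Python's standard `csv` module. The function names (`write_rfc4180_csv`, `read_csv`) and the sample rows are hypothetical and made up for this example; they are not part of data.world or any of its tools. The sketch assumes you are preparing the file yourself before uploading.

```python
import csv
import io


def write_rfc4180_csv(header, rows):
    """Serialize one header row plus data rows to CSV text.

    The csv module quotes fields that contain commas, double quotes or
    line breaks, and doubles embedded double quotes, as RFC 4180 describes.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
    writer.writerow(header)   # exactly one header row, one name per column
    writer.writerows(rows)    # data rows only: no notes or footers
    return buffer.getvalue()


def read_csv(text):
    """Parse CSV text back into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Hypothetical sample data, invented for illustration only.
header = ["address", "price", "notes"]
rows = [
    ["12 Main St, Houston", "250000", 'Listed as "fixer-upper"'],
    ["9 Oak Ave", "310000", "Two lines\nof notes"],
]

csv_text = write_rfc4180_csv(header, rows)
parsed = read_csv(csv_text)
```

Reading the text back with `read_csv` and comparing it to the original rows is a quick way to confirm that nothing was mangled by escaping before you upload the file.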