Data work can be messy. Over time you will create multiple files, versions, and methodologies. Spending a little time upfront to establish a system can save hours of searching later. A good system has several benefits:
- Less time is spent searching for the right file.
- Backups of data reduce the risk of data loss.
- Work becomes well-documented: you know exactly what you did, how you did it, and when.
- Files are created in formats that can be used now and in the future.
- Progress reporting to teams, funders, and stakeholders becomes faster and easier.
- Compliance with university and funder requirements is ensured.
- Data become structured in ways that facilitate analysis and integration.
Create meaningful names relevant to content, independent of where the file is stored.
- Use ISO 8601 dates (YYYY-MM-DD) or (YYYYMMDD). This ensures files sort chronologically by default.
- Use "snake_case" (this_is_the_file) or "CamelCase" (ThisIsTheFile) to separate terms. Never use spaces.
- Pad numbers with leading zeros: file_001.txt instead of file_1.txt.

Don't overwrite the version you need. Establish a system to distinguish successive versions:
- Dates: data_20230101, data_20230201
- Ordinals: Use numbers for major changes and letters for minor changes (e.g., data_v1, data_v1.b).
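
The naming and versioning conventions above can be combined in a small helper. The following is a minimal sketch, not a prescribed tool: the function name `build_filename`, the order of the name parts, and the three-digit padding are all assumptions chosen for illustration.

```python
import re
from datetime import date


def build_filename(description: str, day: date, sequence: int, version: str, ext: str) -> str:
    """Build a name such as 2023-01-01_survey_responses_001_v1.csv (hypothetical layout)."""
    # Lowercase the description and turn every run of non-alphanumeric
    # characters (including spaces) into a single underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", description.lower()).strip("_")
    # ISO 8601 date first, so alphabetical order is also chronological order.
    stamp = day.isoformat()
    # Zero-pad the sequence number to three digits (001, not 1).
    return f"{stamp}_{slug}_{sequence:03d}_{version}.{ext}"


name = build_filename("Survey Responses", date(2023, 1, 1), 1, "v1", "csv")
print(name)  # 2023-01-01_survey_responses_001_v1.csv
```

Because the date comes first and the numbers are padded, a plain alphabetical sort in any file browser lists the files in the order they were created.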
It is helpful to log what changed, who made the change, and why. Keep a basic text file in your folder:
## v2 YYYY-MM-DD J Doe <jdoe@ex.com>
* Adjusted variable labels for clarity
* Removed incomplete survey responses
## v1 YYYY-MM-DD J Doe <jdoe@ex.com>
* Initial data import
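
A log in this layout is easy to maintain by hand, but it can also be written by a short script. The sketch below assumes the newest entry goes at the top, as in the example above. The helper names (`format_entry`, `prepend_entry`), the file name `CHANGELOG.txt`, and the dates are illustrative assumptions.

```python
import tempfile
from datetime import date
from pathlib import Path


def format_entry(version: str, day: date, author: str, changes: list[str]) -> str:
    """Format one log entry: a '## version date author' line, then one '* ' line per change."""
    lines = [f"## {version} {day.isoformat()} {author}"]
    lines += [f"* {change}" for change in changes]
    return "\n".join(lines) + "\n"


def prepend_entry(log_path: Path, entry: str) -> None:
    """Put the new entry above the existing ones so the latest version is read first."""
    existing = log_path.read_text(encoding="utf-8") if log_path.exists() else ""
    log_path.write_text(entry + existing, encoding="utf-8")


# Demonstration in a temporary folder so nothing is left on disk.
with tempfile.TemporaryDirectory() as folder:
    log_path = Path(folder) / "CHANGELOG.txt"  # hypothetical file name
    author = "J Doe <jdoe@ex.com>"
    prepend_entry(log_path, format_entry("v1", date(2023, 1, 1), author, ["Initial data import"]))
    prepend_entry(
        log_path,
        format_entry(
            "v2",
            date(2023, 2, 1),
            author,
            ["Adjusted variable labels for clarity", "Removed incomplete survey responses"],
        ),
    )
    log_text = log_path.read_text(encoding="utf-8")

print(log_text)
```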
To make your data FAIR (Findable, Accessible, Interoperable, Reusable), ensure you have documentation at both the study and data level.
Create a README.txt file at the root. Explain:
Explain specific file contents:
Proprietary formats: use for analysis, but risky for long-term archiving.
- Microsoft Excel (.xlsx)
- SPSS (.sav)
- Photoshop (.psd)
If you keep files in a proprietary format:
1. Include a README detailing the exact software/hardware version needed.
2. Share a secondary copy in an open format (e.g., Image_v1.psd AND Image_v1.tiff).
Open formats: preferred for preservation, to ensure interoperability.
- Tables: .csv (Comma Separated Values)
- Text: .txt (Plain Text)
- Images: .tiff (Lossless Compression)
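
As an illustration of saving a table in an open format, the sketch below writes a few rows to CSV text using only Python's standard `csv` module. The sample rows, the column names, and the helper name `rows_to_csv` are made up for the example; exporting from Excel or SPSS themselves would use those programs' own export options.

```python
import csv
import io

# Made-up sample rows standing in for a real table.
rows = [
    {"participant_id": "001", "age": 34},
    {"participant_id": "002", "age": 41},
]


def rows_to_csv(records: list[dict]) -> str:
    """Serialise a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = rows_to_csv(rows)
print(csv_text)
```

The resulting text can be saved with a `.csv` extension and opened by any spreadsheet program, statistics package, or text editor.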
Note on Compression: If compression is necessary, always use a lossless format to prevent data degradation.
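
What "lossless" means in practice can be checked directly: after compressing and decompressing, the bytes must be identical to the original. The sketch below uses gzip from Python's standard library as one example of a lossless method. The sample data and the helper name `survives_round_trip` are illustrative assumptions.

```python
import gzip


def survives_round_trip(data: bytes) -> bool:
    """Compress and decompress the data, then confirm nothing was changed."""
    compressed = gzip.compress(data)
    restored = gzip.decompress(compressed)
    return restored == data


# Repetitive sample data, similar in shape to a small CSV file.
original = b"id,score\n1,0.5\n2,0.75\n" * 100
compressed = gzip.compress(original)

print(survives_round_trip(original))  # True
print(len(original), len(compressed))  # original size, then the smaller compressed size
```

A lossy format such as JPEG would fail an equivalent check, because the decoded image no longer matches the original pixel for pixel.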