RDM - Organizing Data | ROBERTSON LIBRARY

Organizing Data

Naming, Versioning & Documentation

Why Organization Matters

Data work can be messy. Over time you will create many files, versions, and methods. Spending a little time upfront to establish a system can save hours of searching later.

Good file organization can help in a variety of ways:

Less time is spent searching for the right file.
Backups of data reduce the risk of data loss.
Work becomes well-documented: you know exactly what you did, how you did it, and when.
Files are created in formats that can be used now and in the future.
Progress reporting to teams, funders, and stakeholders becomes faster and easier.
Compliance with university and funder requirements becomes easier to meet and to demonstrate.
Data become structured in ways that facilitate analysis and integration.

File Naming Conventions

Best Practices

Create meaningful names relevant to content, independent of where the file is stored.

Date Format

Use the ISO 8601 date format, written as YYYY-MM-DD or YYYYMMDD. Because the year comes first, files that share the same name prefix sort chronologically by default.
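A minimal sketch of this behaviour in Python follows. The helper dated_name and the "survey" file names are hypothetical, chosen only for illustration.

```python
from datetime import date


def dated_name(stem: str, day: date, ext: str) -> str:
    """Build a name such as 'survey_2023-01-01.csv' (hypothetical helper)."""
    return f"{stem}_{day.isoformat()}.{ext}"


names = [
    dated_name("survey", date(2023, 2, 1), "csv"),
    dated_name("survey", date(2022, 12, 15), "csv"),
    dated_name("survey", date(2023, 1, 1), "csv"),
]

# Plain alphabetical sorting matches chronological order
# because the year comes first, then the month, then the day.
sorted_names = sorted(names)
print(sorted_names)
```

The same ordering would not hold for formats such as DD-MM-YYYY, where the day is compared first.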

Separators & Formatting

Use underscores (this_is_the_file) or "CamelCase" (ThisIsTheFile) to separate terms. Never use spaces.

Sorting (Zero-Padding)

If you have many files, use placeholder digits to maintain order. Use file_001.txt instead of file_1.txt.
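The effect of zero-padding can be illustrated with a short Python sketch. The helper padded_name and the width of three digits are assumptions for the example; choose a width that fits the largest number you expect.

```python
def padded_name(stem: str, index: int, ext: str, width: int = 3) -> str:
    """Return a name like 'file_001.txt' (hypothetical helper)."""
    return f"{stem}_{index:0{width}d}.{ext}"


# Without padding, 'file_10.txt' sorts before 'file_2.txt'.
unpadded = sorted(f"file_{i}.txt" for i in (1, 2, 10))

# With padding, alphabetical order matches numeric order.
padded = sorted(padded_name("file", i, "txt") for i in (1, 2, 10))

print(unpadded)
print(padded)
```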

No Special Characters

Avoid the following characters, which have special meanings in shells, scripts, and some operating systems and can break commands or file paths: ~ ! # & @ ( ) { } [ ] ' " | % $ ; ^
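One possible way to clean up existing names is sketched below in Python. The function safe_name is hypothetical. It assumes that runs of the characters listed above (in both straight and curly quote forms) and whitespace should collapse to a single underscore, and that the file extension itself is already clean.

```python
import re
from pathlib import PurePath

# Characters to avoid, with straight and curly quotes both included.
UNSAFE_CHARS = "~!#&@(){}[]'\"‘’“”|%$;^"
_UNSAFE = re.compile("[" + re.escape(UNSAFE_CHARS) + r"\s]+")


def safe_name(name: str) -> str:
    """Replace unsafe characters and spaces in the stem with underscores."""
    path = PurePath(name)
    stem = _UNSAFE.sub("_", path.stem)
    # Collapse repeated underscores and trim them from both ends.
    stem = re.sub(r"_+", "_", stem).strip("_")
    return stem + path.suffix


print(safe_name("My Data (final)!.csv"))
```

Review the cleaned names before renaming anything, since two different originals can collapse to the same result.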

Version Control

Versioning Strategies

Don't overwrite the version you need. Establish a system to distinguish successive versions:

Dates: data_20230101, data_20230201
Ordinals: Use numbers for major changes and letters for minor changes (e.g., data_v1, data_v1.b).

Golden Rule: Never overwrite your raw master data. Save cleaned or analyzed versions as new files.
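As a rough sketch of the ordinal strategy, the Python below saves each result under the next unused major version number and refuses to overwrite an existing file. The function names and the stem_vN.ext pattern are assumptions for illustration, and the demo writes to a temporary folder.

```python
import re
import tempfile
from pathlib import Path


def next_version_path(folder: Path, stem: str, ext: str) -> Path:
    """Return the path for the next unused major version, e.g. data_v3.csv."""
    pattern = re.compile(rf"^{re.escape(stem)}_v(\d+)\.{re.escape(ext)}$")
    versions = [
        int(match.group(1))
        for path in folder.iterdir()
        if (match := pattern.match(path.name))
    ]
    return folder / f"{stem}_v{max(versions, default=0) + 1}.{ext}"


def save_new_version(folder: Path, stem: str, ext: str, text: str) -> Path:
    """Write text to a new versioned file without touching earlier versions."""
    target = next_version_path(folder, stem, ext)
    # Mode "x" raises FileExistsError instead of overwriting an existing file.
    with open(target, "x", encoding="utf-8") as handle:
        handle.write(text)
    return target


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    first = save_new_version(folder, "data", "csv", "id,score\n1,4\n")
    second = save_new_version(folder, "data", "csv", "id,score\n1,5\n")
    saved_names = [first.name, second.name]
    first_contents = first.read_text(encoding="utf-8")

print(saved_names)
```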

The Changelog

It is helpful to log what changed, who made the change, and why. Keep a basic text file in your folder:

# fileName_Changelog

## v2 YYYY-MM-DD J Doe <jdoe@ex.com>
* Adjusted variable labels for clarity
* Removed incomplete survey responses

## v1 YYYY-MM-DD J Doe <jdoe@ex.com>
* Initial data import
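If you prefer to generate entries rather than type them, a small Python sketch along these lines could produce text in the layout shown above. The function names, the example dates, and the choice to place the newest entry directly under the title are assumptions.

```python
from datetime import date


def format_entry(version: str, day: date, author: str, email: str,
                 changes: list[str]) -> str:
    """Format one changelog entry in the layout shown above."""
    header = f"## {version} {day.isoformat()} {author} <{email}>"
    bullets = [f"* {change}" for change in changes]
    return "\n".join([header, *bullets])


def add_entry(changelog: str, entry: str) -> str:
    """Insert entry below the title line so the newest version comes first."""
    title, _, rest = changelog.partition("\n")
    parts = [title, entry]
    if rest.strip():
        parts.append(rest.strip())
    return "\n\n".join(parts) + "\n"


existing = (
    "# data_Changelog\n\n"
    "## v1 2023-01-01 J Doe <jdoe@ex.com>\n"
    "* Initial data import\n"
)
entry = format_entry("v2", date(2023, 2, 1), "J Doe", "jdoe@ex.com",
                     ["Removed incomplete survey responses"])
updated = add_entry(existing, entry)
print(updated)
```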

Documentation Levels

To make your data FAIR (Findable, Accessible, Interoperable, Reusable), ensure you have documentation at both the study and data level.

Study-Level (README)

Create a README.txt file at the root of your project folder. Explain:

Context

Who collected the data, when, and why?

Software Requirements

What software (including version #) is needed to open these files?

Data-Level (Codebook)

Explain specific file contents:

Variable Labels

"P1_Q3" → "Participant 1, Question 3"

Codes

"999 = Missing", "0 = Control"

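A codebook can also be kept in a form that scripts can apply. The Python sketch below uses the label and code examples above. The column name "group", the sample rows, and the function decode_row are hypothetical, and it assumes 999 marks a missing value in every column.

```python
import csv
import io
from typing import Optional

# Codebook entries from the examples above; "group" is a hypothetical column.
VARIABLE_LABELS = {"P1_Q3": "Participant 1, Question 3"}
VALUE_CODES = {"group": {"0": "Control"}}
MISSING_CODE = "999"


def decode_row(row: dict[str, str]) -> dict[str, Optional[str]]:
    """Replace variable names with labels and coded values with meanings."""
    decoded: dict[str, Optional[str]] = {}
    for column, value in row.items():
        label = VARIABLE_LABELS.get(column, column)
        if value == MISSING_CODE:
            decoded[label] = None  # treat the missing code as no value
        else:
            decoded[label] = VALUE_CODES.get(column, {}).get(value, value)
    return decoded


# Hypothetical sample data standing in for a real CSV file.
sample = io.StringIO("group,P1_Q3\n0,4\n0,999\n")
rows = [decode_row(row) for row in csv.DictReader(sample)]
print(rows)
```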
File Formats

Proprietary Formats

Acceptable for analysis, but risky for long-term archiving because the files may depend on software that later becomes unavailable.

Microsoft Excel (.xlsx)
SPSS (.sav)
Photoshop (.psd)

If you must use proprietary:

1. Include a README detailing the exact software/hardware version needed.
2. Share a secondary copy in an open format (e.g., Image_v1.psd AND Image_v1.tiff).

Open Formats

Preferred for preservation because they can be opened without specific software, which supports interoperability.

Tables: .csv (Comma Separated Values)
Text: .txt (Plain Text)
Images: .tiff (uncompressed or with lossless compression)

Note on Compression: If compression is necessary, always use a lossless format to prevent data degradation.
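To illustrate what lossless means in practice, the Python sketch below compresses some sample bytes with gzip (one lossless option among several) and checks that the restored copy is identical to the original. The sample data is made up for the example.

```python
import gzip
import hashlib

# Hypothetical sample data standing in for a real file's contents.
original = b"id,score\n1,4\n2,999\n" * 100

compressed = gzip.compress(original)
restored = gzip.decompress(compressed)

# Matching checksums confirm that nothing was lost in the round trip.
same = (
    hashlib.sha256(original).hexdigest()
    == hashlib.sha256(restored).hexdigest()
)
print(len(original), len(compressed), same)
```

A lossy format, such as JPEG for images, would not pass this kind of check because the restored data differs from the original.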
