File structure for different types of code

Directory structure for projects

A good starting point is to keep all files associated with a project in a single folder
Different projects should have separate folders
Use consistent and informative directory structure
If you need to separate public/private/secret, separate these by folder (and Git repo)
Add a README file to describe the project and instructions on reproducing the results
Talk to others in the project about what you do and write it down
Your mileage may vary: it’s not a one-size-fits-all
When software is reused in several projects, it can make sense to put it in its own repo

Structure for cloud-based code

Structure for text-based projects

Structure for compiled code

A project directory can look something like this:

project_name/
├── README.md             # overview of the project
├── data/                 # data files used in the project
│   ├── README.md         # describes where data came from
│   └── sub-folder/       # may contain subdirectories
├── processed_data/       # intermediate files from the analysis
├── manuscript/           # manuscript describing the results
├── results/              # results of the analysis (data, tables, figures)
├── src/                  # contains all code in the project
│   ├── LICENSE           # license for your code
│   ├── requirements.txt  # software requirements and dependencies
│   └── ...
└── doc/                  # documentation for your project
    ├── index.rst
    └── ...
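
As a rough sketch, a layout like the one above could be created with a few shell commands. The name project_name is the placeholder from the listing, the files are created empty, and the optional data/sub-folder/ is left out; adapt the lists to your own project.

```shell
# Sketch: scaffold the example layout in the current directory.
# "project_name" is a placeholder; replace it with your own project name.
project="project_name"

# Create the top-level folders from the listing.
mkdir -p "$project/data" \
         "$project/processed_data" \
         "$project/manuscript" \
         "$project/results" \
         "$project/src" \
         "$project/doc"

# Create empty placeholder files; fill them in as the project grows.
touch "$project/README.md" \
      "$project/data/README.md" \
      "$project/src/LICENSE" \
      "$project/src/requirements.txt" \
      "$project/doc/index.rst"
```

Both mkdir -p and touch are safe to re-run: existing folders and files are left in place.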

Tracking source code, data, and results

All code is version controlled and goes in the src/ or source/ directory
Include an appropriate LICENSE file and information on software requirements
You can also version control data files or input files under data/
If data files are too large (or sensitive) to track, keep them out of version control by listing them in .gitignore
Intermediate files from the analysis are kept in processed_data/
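
One possible way to keep large or sensitive data out of the repository, while still tracking the note on where it came from, is sketched below. It assumes the commands are run from the top of the repository and that everything under data/ except data/README.md should be ignored. The patterns and the file name in the last comment are illustrative choices, not a prescribed setup.

```shell
# Sketch: append ignore rules to .gitignore (run from the top of the repository).
# Assumption: all of data/ is too large or sensitive to track, except its README.
cat >> .gitignore <<'EOF'
data/*
!data/README.md
EOF

# .gitignore only affects files that Git is not tracking yet.
# A file that was already committed has to be untracked explicitly, for example:
#   git rm --cached data/large_file.csv   (hypothetical file name; the file stays on disk)
```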

Consider using Git tags to mark specific versions of results (version submitted to a journal, dissertation version, poster version, etc.):

$ git tag -a thesis-submitted -m "this is the submitted version of my thesis"
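
To try tagging without touching a real project, the sketch below sets up a throwaway repository and then looks the tag up again. The temporary directory, the "Demo" identity, and the commit message are placeholders chosen for illustration; only the tag name and tag message come from the command above.

```shell
# Throwaway repository so the commands can be tried safely.
demo=$(mktemp -d)
git -C "$demo" init -q

# Placeholder identity and an empty commit, so there is something to tag.
git -C "$demo" -c user.name="Demo" -c user.email="demo@example.com" \
    commit -q --allow-empty -m "analysis as submitted"
git -C "$demo" -c user.name="Demo" -c user.email="demo@example.com" \
    tag -a thesis-submitted -m "this is the submitted version of my thesis"

# List tags together with the first line of their messages.
git -C "$demo" tag -n

# Resolve the tag to the commit it marks.
git -C "$demo" rev-parse "thesis-submitted^{commit}"

# In a real project, a tag is shared only when it is pushed explicitly:
#   git push origin thesis-submitted   ("origin" is an assumed remote name)
```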

Reproducible publications

Git can be used to collaborate on manuscripts written in LaTeX and other text-based formats, but other tools exist:
- Overleaf (has Git integration)
- Authorea (apparently also has Git integration)
- Google Docs can be a good alternative
Many tools exist to assist in making scholarly output reproducible:
- rrtools: Instructions, templates, and functions for making a basic compendium suitable for writing a reproducible journal article or report with R.
- Jupyter Notebooks: Web-based interactive computational environment for creating notebook documents. Can be used for supplementary material with journal articles.
- Binder: Make a repository with Jupyter notebooks available in an executable environment.
- “Research compendia”: A set of good practices for reproducible data analysis in R, but much is transferable to other languages.
Do you want to practice your reproducibility skills and get inspired by working with other people’s code/data? Join a ReproHack event!

Source: Organizing your projects