
The Open Science Toolbox: Reproducibility Using Git, Python, Zenodo, & More

Join us for a virtual training session to boost your open science skills with open science experts from DUTC!

By Don't Use This Code

Date and time

Tuesday, July 1 · 8 - 9:30am PDT

Location

Online

Refund Policy

Refunds up to 7 days before event

About this event

  • Event lasts 1 hour 30 minutes

Event Description:


The Open Science Toolbox: Make Reproducible Research Using Git, Python, Zenodo, and More



Technologies Covered in the Seminar


  • Open data repositories (Zenodo, Data.gov) for locating research-ready datasets
  • Pixi for managing reproducible software environments
  • Git and GitHub for version control and collaboration
  • Zenodo for archiving and citing code and datasets
  • Python, with a focus on pandas for data wrangling and Matplotlib for visualization


Training Overview

Reproducibility in research starts with using the right tools. This 90-minute
seminar offers a practical introduction to essential technologies that help
researchers write clean, trackable, and shareable code. The session is
structured as a guided demo, with permanent access to a recorded walkthrough
and a companion repository of materials.

We will begin by reviewing how to find and evaluate public datasets.
Participants will learn how to assess licensing, source quality, and fitness
for use, using examples from popular repositories.

We will then look at how to set up a clean, reproducible computational
environment using Pixi. This makes it easy to install Python and the packages
needed for analysis without running into version conflicts.
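
As a complement to a managed environment, a script can record which versions
it actually ran with. The sketch below is illustrative only and is not taken
from the seminar materials; the function name `snapshot_versions` and the
package list are assumptions. It uses only the Python standard library.

```python
import platform
from importlib import metadata


def snapshot_versions(packages):
    """Map 'python' and each package name to its installed version (or None)."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Hypothetical package list; adjust to whatever the analysis imports.
    for name, version in snapshot_versions(["pandas", "matplotlib"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Saving this output next to a figure or table gives readers a quick way to
compare their environment with the one that produced the result.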

From there, we will introduce version control using Git. Participants will see
how Git tracks changes to code, how GitHub supports collaboration, and how to
prepare research outputs for sharing. We will also demonstrate how to archive a
project using Zenodo, making it easy for others to cite and reuse.
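
One way to prepare a repository for archiving is to keep citation metadata in
a `.zenodo.json` file at the repository root, which Zenodo's GitHub integration
can read. The listing does not say whether the seminar uses this approach, so
treat the following as a minimal sketch under that assumption. The helper
names, the project title, the author, and the license identifier are all
placeholders.

```python
import json
import tempfile
from pathlib import Path


def build_zenodo_metadata(title, creators, description, license_id, keywords):
    """Assemble a metadata dict using common .zenodo.json field names."""
    return {
        "title": title,
        "upload_type": "software",
        "creators": [{"name": name} for name in creators],
        "description": description,
        "license": license_id,
        "keywords": list(keywords),
    }


def write_zenodo_metadata(metadata, directory="."):
    """Write the metadata as .zenodo.json in `directory` and return the path."""
    path = Path(directory) / ".zenodo.json"
    path.write_text(json.dumps(metadata, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = build_zenodo_metadata(
        title="Example analysis project",
        creators=["Doe, Jane"],
        description="Code and figures for an example analysis.",
        license_id="MIT",
        keywords=["reproducibility", "open science"],
    )
    with tempfile.TemporaryDirectory() as tmp:
        print(write_zenodo_metadata(example, tmp).read_text(encoding="utf-8"))
```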

Throughout the session, we will use Python to explore and visualize datasets.
Examples will focus on pandas for data processing and Matplotlib for creating
clear, publication-ready figures. The emphasis will be on writing transparent
code that others can inspect and reuse.
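
To give a flavor of this kind of workflow, here is a small sketch that is not
drawn from the seminar itself. The dataset is made up, and the column names,
function names, and output filename are assumptions chosen for illustration.
It cleans a table with pandas, summarizes it, and saves a labeled figure with
Matplotlib.

```python
import matplotlib

matplotlib.use("Agg")  # render to a file; no display is required

import matplotlib.pyplot as plt
import pandas as pd

# Made-up measurements for illustration; one value is missing on purpose.
raw = pd.DataFrame(
    {
        "site": ["A", "A", "B", "B", "C", "C"],
        "temperature_c": [12.1, 13.4, 15.0, None, 9.8, 10.6],
    }
)


def summarize(df):
    """Drop rows with missing temperatures and return the mean per site."""
    return (
        df.dropna(subset=["temperature_c"])
        .groupby("site", as_index=False)["temperature_c"]
        .mean()
    )


def plot_summary(summary, path):
    """Save a bar chart of mean temperature per site to `path`."""
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.bar(summary["site"], summary["temperature_c"])
    ax.set_xlabel("Site")
    ax.set_ylabel("Mean temperature (°C)")
    fig.tight_layout()
    fig.savefig(path, dpi=300)
    plt.close(fig)


if __name__ == "__main__":
    plot_summary(summarize(raw), "mean_temperature.png")
```

Keeping the cleaning step and the plotting step in separate, named functions
makes each one easy for a reader to inspect and rerun on their own data.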

This seminar is for researchers who want to improve the way they manage code,
analyze data, and share results. No prior experience with these tools is
required, but even experienced users may pick up new ideas for structuring
their research workflows.


Agenda


Part 1: Find and Prepare Data

  • Overview of reproducibility and why it matters
  • Finding datasets on Zenodo, Kaggle, and Data.gov
  • Assessing licenses, documentation, and data quality

Part 2: Environment Management and Coding Workflows

  • Setting up a clean environment with Pixi
  • Installing Python and core packages
  • Using Git to track changes and manage projects
  • Introduction to pandas and Matplotlib for basic data analysis

Part 3: Share Your Work

  • Hosting code on GitHub
  • Archiving and citing research materials using Zenodo


What to Expect

This is a demonstration-focused seminar. Participants will not need to install
any software or follow along in real time. While we will demonstrate practical Python
examples live, this is not a hands-on coding workshop. The focus is on giving
participants a clear picture of how these tools fit together in a reproducible
research workflow.


Frequently asked questions

What do I need to participate?

You will need a computer with a stable internet connection and a willingness to learn and engage!

What if I can’t attend the live session?

No problem! Here’s how you can still benefit:

  • Register anyway, and we’ll send you the recording so you can watch at your convenience.
  • Have questions? Our team is happy to provide follow-up support via email (openscience@dutc.io).


$19.99