Data Carpentry Workshop provides fundamental computational skills necessary for genomics research


Amar Kumar, a University of Kansas Ph.D. candidate in computational biology, is no stranger to R, a computer language used for statistical analysis. As a member of KU’s Funk Lab in pharmacology, Kumar regularly uses custom commands and scripts while analyzing data related to juvenile arthritis. When he got an email about a free Data Carpentry Genomics Workshop hosted by KU Libraries, he jumped at the chance to sharpen his skills. 

Kumar was one of nearly 20 participants from a wide range of academic disciplines who learned new data skills or added to their knowledge during a three-day, hands-on workshop taught by certified Carpentries instructors in Watson Library’s Clark Instruction Center and online, January 9-11. Although nearing the end of his studies at KU, Kumar says he left the workshop with a set of new commands to improve his data.

“I'm learning in my fifth year, but still it's helpful,” Kumar said. “It doesn't matter when you're learning it – knowledge is knowledge. I hope they’ll have a couple more workshops in the future. That'd be great for upcoming students.”

Students sit at tables during the Data Carpentry Genomics Workshop.

Data Carpentry uses periodic workshops to teach fundamental computational skills necessary for research, including how to organize, clean, and query data using open-source tools. Learners are also taught how to analyze and visualize data using a programming language. As the name implies, the most recent workshop was domain-specific, highlighting genomics data cleaning and analysis using the Unix command line and R.

Jamene Brooks-Kieffer, Data Services Librarian and Coordinator of Digital Scholarship, says the workshops help bridge a research knowledge gap with digital skills that are not often taught in classrooms and are otherwise difficult to self-learn.

“A lot of departments assume that you have this knowledge already or that you're going to pick it up at the lab,” Brooks-Kieffer said. “It is not necessarily the disciplines’ or the programs’ fault because they have a ton of other content that they need these people to learn in order to be experts in that particular field.”

Throughout the genomics workshop, instructors covered the basics in the morning sessions before introducing advanced concepts in the afternoons, all using genomics-specific data sets like the ones learners might use in their own research areas or labs. The hands-on instruction provided clarity for Kumar and allowed him to work at a more reasonable pace than in other classes or tutorials he has taken.

“I have taken some online courses as well where they just give you 10 minutes of introduction – although the tagline says that it’s from the basics – and they jumped onto the advanced level, and I was left thinking, ‘OK, what am I learning? Nothing,’” Kumar said.

Katie Hanson, a Ph.D. candidate in molecular biosciences at KU, was one of two instructors who, along with a team of helpers, facilitated the genomics workshop. Hanson, who uses quantitative data in her research with fruit flies, had an unpleasant experience with RStudio, a programming environment for R, early on, but with the help of a postdoctoral researcher and online tools, she gradually improved her knowledge of the program. That experience fueled her passion for teaching those data skills to KU students and researchers to help make research more reproducible and transparent.

“It’s really important to me that when people learn how to code, that they have a good experience of it and that they have a good instructor who can clearly explain it,” Hanson said. 

Hanson started out with The Carpentries workshops as a learner. After she had attended four workshops, Brooks-Kieffer approached her with an opportunity to receive Carpentries instructor training. Thanks to KU’s Carpentries membership, KU affiliates like Hanson are offered free instructor training and certification through The Carpentries organization every year.

KU first joined The Carpentries as a member organization in 2019. In addition to offering instructor training, the membership allows the university to host its own workshops using its own instructors. 

The membership is funded by KU Libraries, KU Research, and KU Information Technology, with additional support from Aquatic Intermittency effects on Microbiomes in Streams (AIMS), a research project headquartered at KU that studies how intermittent streams support both the environment and humankind. KU Libraries offer Software Carpentry and Data Carpentry workshops intermittently throughout the academic year. Brooks-Kieffer has been coordinating KU’s Carpentries workshops since 2016 and KU’s Carpentries membership since 2019.

The workshops switched to an online format during the pandemic. Brooks-Kieffer said that hosting an online workshop was unimaginable before COVID-19 and was even discouraged by The Carpentries organization. 

“So much of what you learn and how you learn in those workshops depends on people being able to see you when you’re getting frustrated,” she said. “Maybe you’re sitting next to somebody, and you see that they’re confused, you can lean over and kind of help them out or they can lean over and help you out.” 

Carpentries instructors at KU have taught online so often that they’re now comfortable with the online format. That familiarity came in handy on the opening day of the workshop, when weather closed KU’s campus and necessitated a switch to all-virtual instruction; the remaining two days were delivered in a hybrid format. The genomics workshop was the first Carpentries workshop at KU with in-person instruction since the start of the pandemic.

“We were able to pivot pretty quickly and figure out how to best use the online platforms that were available to do a pretty good online program,” Brooks-Kieffer said.

In addition to Hanson, the latest workshop was co-instructed by Caroline Kisielinski, a postdoctoral researcher in mammalogy at the KU Biodiversity Institute and Natural History Museum. Savannah Hay, Boryana Koseva, Daniel Montezano, Sarah Unkel and Brooks-Kieffer all served as helpers, an official Carpentries role that involves “supporting learners one-on-one if they are stuck installing software, understanding a certain line of code, or any other parts of the learning process,” according to The Carpentries Handbook.

Carpentries workshops are free and available to everyone. Workshops are offered based on the availability of instructors and, in addition to genomics, have covered domain-specific instruction in ecology, social sciences and geospatial data. The number of seats in each workshop is limited to ensure the highest quality of learning and instructor-learner interaction for participants.

More information about The Carpentries and future workshops can be found on KU Libraries’ Carpentries webpage and by subscribing to the ku-carpentries-news mailing list.