Recent graduate Seth Berke didn’t expect to leave Hopkins interested in pursuing a research career but, after using cloud computing methods to analyze genomic data, that’s exactly what’s happened. Berke works with biostatistician Ingo Ruczinski, developing more efficient methods of using and gaining insight from preexisting data sets.
Many genomic databases operate under the FAIR Data Principles. FAIR data must be findable, accessible, interoperable and reusable. While these FAIR databases exist — including many operated by the National Institutes of Health (NIH) — it can be difficult to meaningfully access them due to a variety of complications. Berke’s focus has been on developing cloud pathways capable of wrestling with this complexity.
“FAIR data seeks to further biomedical research through this lens where everyone has access to it, but the issue and the barrier is that often there’s no standard way of analyzing it. There are also barriers with data management, metadata and other reasons that people struggle with analyzing this data that’s available to everyone,” he said. “We use cloud computing technology to import this data, to analyze it and find meaningful results.”
The challenges posed by data storage and metadata necessitate the usage of cloud-based technologies over local analysis. Working with CAVATICA — a genomics cloud computing platform funded in part by the NIH — Berke has imported code and raw data into the cloud, analyzed it and retrieved results.
“We’re essentially bringing the data analysis to the data, and not the other way around, because trying to download terabytes of files onto your laptop is just not going to work,” he said. “Right now, we’re looking at orofacial clefts, and we found some new genes that we’re further investigating for causality for orofacial clefts.”
While the specific genes connected to orofacial clefts elucidated through this work may prove vital in developing new treatments going forward, according to Berke, the exciting part of this research is that it provides a general means of analyzing genomic data, not simply an answer to a particular question. The same analytical tools can be employed to study cancer, cystic fibrosis or any other genetically linked condition.
“While the phenotype that we are doing is orofacial clefts, our lab is more of a methods or a feasibility lab focused on harnessing these resources… There are huge translational and medical applications to all this work so if you’re able to go in and analyze data that already exists, you can potentially lead the way for new treatments and then translate it into actual applicable medical findings,” Berke said.
Berke was first drawn to Ruczinski’s lab by its specific focus on orofacial clefts. As a pre-dental student, he hoped to get involved in any dental research taking place at Hopkins. However, as he got involved in the project and began considering the role that science might play in his life moving forward, Berke realized that the dental path could no longer offer him everything he needed. Having graduated in December 2023, Berke now hopes to pursue a PhD in genomics.
Part of the rationale behind this shift in direction was seeing the immediacy of scientific work firsthand, which showed him the impact that researchers have on the medical field as a whole.
“In school, you learn broadly about research and about science, but when you’re actually in a lab, you understand how it translates into the real world. The biggest thing when I first started was that, ‘Wow, these findings will be sent to a journal that can be read by geneticists, by dentists, by people interested in this area of research, and it could contribute to an actual cause,’” he said. “I shifted my way of learning from the broad to how to impact the world. What skills do I have to help?”
Part of this exploration was presenting a research poster at the 2023 International Genetic Epidemiology Society Conference in Nashville, Tenn., which enabled him to engage directly with cutting-edge work across the genomics world.
“Going to the conference was one of the most meaningful things I’ve done. There were a lot of other people in academia just sharing their ideas, so I learned a tremendous amount and was happy just to contribute to overall research just by being there and having those discussions,” Berke said.
Even as research has shaped his Hopkins experience and future goals, Berke acknowledges the struggles that accompany beginning a new project, particularly one with a steep learning curve. He offered thanks for Ruczinski’s mentorship, as well as that of Kanika Kanchan, a postdoc in Ruczinski’s lab, and Eric Tobin and Cera Fisher, who work at Velsera, a computational genomics company that provides the technology used to run lab workflows. Berke described all of them as providing constant help and support.
“At first, I felt kind of lost in the field. I didn’t have a good sense of what cloud computing was even after reading the documentation,” he said. “It was really hard and took a great deal of just outside learning to get to the point where I understood the right questions to ask… The hardest part is definitely the beginning, and since then things have been a lot more straightforward.”
Research on the Record spotlights undergraduate students involved in STEM research at Hopkins. The goal of the column is to share reflections on the highs and lows that Hopkins students experience in their contributions to undergraduate research. If you are an undergraduate researcher interested in being profiled, reach out to [email protected].