Machine learning enhances study of 3D genome structure in cell nucleus

Computational methods used to fill in missing pixels in low-quality images or video can also help scientists recover missing information about how DNA is organized in the cell, computational biologists at Carnegie Mellon University have shown.

Filling in this missing information will make it possible to more readily study the 3D structure of chromosomes and, in particular, subcompartments that may play a crucial role in both disease formation and determining cell functions, said Jian Ma, associate professor in CMU’s Computational Biology Department.

In a research paper published today by the journal Nature Communications, Ma and Kyle Xiong, a CMU Ph.D. student in the CMU-University of Pittsburgh Joint Ph.D. Program in Computational Biology, report that they successfully applied their machine learning method to nine cell lines. This enabled them, for the first time, to study differences in spatial organization related to subcompartments across those lines.

Previously, subcompartments could be revealed in only one cell type: a lymphoblastoid cell line known as GM12878, which has been exhaustively sequenced at great expense using Hi-C technology, a method that measures spatial interactivity among all regions of the genome.

“We now know a lot about the linear composition of DNA in chromosomes, but in the nuclei of human cells, DNA isn’t linear,” Xiong said. “Chromosomes in the cell nucleus are folded and packaged into 3D shapes. That 3D structure is critical to understanding the cellular functions in development and diseases.” Subcompartments are of particular interest because they reflect spatial segregation of chromosome regions with high interactivity.

Scientists are eager to learn more about the juxtaposition of subcompartments and how it affects cell function, Ma said. But until now researchers could calculate the patterns of subcompartments only if they had an extremely high-coverage Hi-C dataset, that is, one in which the DNA had been sequenced in great detail to capture more interactions. That level of detail is missing in the datasets for cell lines other than GM12878.

Working with Ma, Xiong used an artificial neural network called a denoising autoencoder to help fill in the gaps in less-than-complete Hi-C datasets. In computer vision applications, the autoencoder can supply missing pixels by learning what types of pixels typically are found together and making its best guess. Xiong adapted the autoencoder to high-throughput genomics, using the dataset for GM12878 to train it to recognize which pairs of DNA sequences from different chromosomes typically might be interacting with each other in 3D space in the cell nucleus.
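For readers curious about the general idea, a denoising autoencoder can be sketched in a few lines of Python. This is only a toy illustration and not the SNIPER model: the single hidden layer, the random masking rate, the synthetic low-rank data, and every name in the code (`train_denoising_autoencoder`, `reconstruct`, `X_missing`) are assumptions made for the example. The network sees copies of the data with entries randomly blanked out and is trained to rebuild the complete version.

```python
import numpy as np


def train_denoising_autoencoder(X, hidden=8, drop=0.3, epochs=500, lr=0.01, seed=0):
    """Train a one-hidden-layer autoencoder to rebuild X from randomly masked copies."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0.0, 0.1, (d, hidden))   # encoder weights
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d))   # decoder weights
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        # Corrupt the input: zero out roughly `drop` of the entries.
        keep = rng.random(X.shape) > drop
        X_noisy = X * keep
        # Forward pass: encode the corrupted input, decode to a full row.
        H = np.tanh(X_noisy @ W1 + b1)
        X_hat = H @ W2 + b2
        # The target is the clean data, not the corrupted copy.
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for squared error averaged over samples
        # (a constant multiple of the recorded loss gradient).
        g_out = 2.0 * err / n
        g_W2 = H.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_Z = (g_out @ W2.T) * (1.0 - H ** 2)
        g_W1 = X_noisy.T @ g_Z
        g_b1 = g_Z.sum(axis=0)
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def reconstruct(X_in):
        """Map (possibly incomplete) rows to the network's best guess."""
        return np.tanh(X_in @ W1 + b1) @ W2 + b2

    return reconstruct, losses


# Synthetic stand-in data with low-rank structure (hypothetical, not Hi-C data).
rng = np.random.default_rng(1)
X = rng.random((200, 3)) @ rng.random((3, 20))
X /= X.max()

reconstruct, losses = train_denoising_autoencoder(X)
X_missing = X * (rng.random(X.shape) > 0.3)  # simulate missing entries
X_filled = reconstruct(X_missing)
```

The key design point is that the loss compares the output with the complete data while the input is deliberately degraded, which pushes the network to learn which values tend to occur together.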

This computational method, which Ma and Xiong have dubbed SNIPER, proved successful in identifying subcompartments in eight cell lines whose interchromosomal interactions based on Hi-C data were only partially known. They also applied SNIPER to the GM12878 data as a control. But Xiong noted that it is not yet known how widely this tool can be applied to other cell types. He and Ma are continuing to enhance the method, however, so it can be used on a variety of cellular conditions and even in different organisms.

“We need to understand how subcompartment patterns are involved in the basic functions of cells, as well as how mutations can affect these 3D structures,” Ma said. “Thus far, in the few cell lines we’ve been able to study, we see that some subcompartments are consistent across cell types, while others vary. Much remains to be learned.”

The National Institutes of Health and the National Science Foundation supported this work.

Story Source:

Materials provided by Carnegie Mellon University. Note: Content may be edited for style and length.
