How the Cancer Genome Atlas will work

The Cancer Genome Atlas will rely heavily on information technology as it crunches data to map the cancer genome.

By Larry Dignan, Contributor March 30, 2007 at 5:45 a.m. PT

1 of 9 Larry Dignan/ZDNET

The project will collect two samples from each individual: one from the tumor and one from normal tissue. The samples will be tracked via bar code and DNA tests.
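
As a rough illustration of how a tumor sample and a normal sample might be tracked as a pair, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not the project's actual system: the `Sample` record, the bar code strings, the `markers` field standing in for a DNA test, and the 80 percent match threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One tissue sample. All fields are hypothetical, for illustration only."""
    barcode: str    # made-up bar code string
    donor_id: str   # de-identified donor label
    tissue: str     # "tumor" or "normal"
    markers: tuple  # made-up genotype calls standing in for a DNA test


def is_matched_pair(tumor, normal, min_match=0.8):
    """Return True if the two samples plausibly come from the same individual.

    The check compares donor labels and then the fraction of identical
    marker calls; the 0.8 threshold is an arbitrary illustrative value.
    """
    if tumor.tissue != "tumor" or normal.tissue != "normal":
        return False
    if tumor.donor_id != normal.donor_id:
        return False
    if not tumor.markers or len(tumor.markers) != len(normal.markers):
        return False
    same = sum(a == b for a, b in zip(tumor.markers, normal.markers))
    return same / len(tumor.markers) >= min_match


tumor = Sample("T-0001", "D1", "tumor", ("AA", "AG", "GG", "CT", "TT"))
normal = Sample("N-0001", "D1", "normal", ("AA", "AG", "GG", "CT", "CT"))
print(is_matched_pair(tumor, normal))
```

A tumor can differ from the same person's normal tissue at some sites, which is why this sketch allows a partial match rather than requiring identical markers.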

Data from the samples will be compiled and combined with other information. The National Cancer Institute has made privacy a high priority.

Data management will become difficult given all the variables in cancer tissue.

Once the data is compiled, it will be stored in a data coordination center. Middleware, specifically JBoss, will serve as the interface to the various databases involved in the project.

Analyzing the information will take some computing horsepower. The Cancer Genome Atlas plans to tap into other high-performance computing grids for the analysis.

A committee will determine who has access to specific data on each sample.

One challenge with managing these genetic sequences is agreeing on data definitions.

If this plays out like the project to map human DNA, the public database will lead to more commercial development of cancer-targeting drugs.