Cancer treatment often focuses on the organ of origin, but several distinct cancer types can arise in the same organ. Gene expression provides valuable clues to the cancer type, but analyzing such data manually is difficult. Instead, we use a variational autoencoder, a deep learning method, to derive a 36-dimensional feature space from a 5000-dimensional gene expression space, and we show its efficacy in classification and in a t-SNE visualization.
While many other diseases are relatively predictable and treatable, cancer is highly diverse and unpredictable, which makes diagnosis, treatment, and control extremely difficult. Traditional approaches treat cancer according to its organ of origin, such as breast or brain cancer, but this kind of classification is often inadequate. If cancers can be identified from their gene expression profiles, there is hope of finding better medicines and treatment methods. However, gene expression data is so high-dimensional that such patterns are hard to detect by manual inspection. In this project, we apply unsupervised deep learning to identify cancer subtypes automatically. In addition, we seek to organize patients by the similarity of their gene expression profiles, so that similar patients are easier to recognize.
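To make the idea of compressing gene expression into a small feature space concrete, the following is a minimal sketch of a single forward pass and loss evaluation for a variational autoencoder, written in plain Python. Only the 5000 input dimensions and 36 latent dimensions come from the text. The single-layer encoder and decoder, the scaling of expression values to [0, 1], the binary cross-entropy reconstruction term, and all function names are assumptions made for illustration, not a description of the model used in this project.

```python
import math
import random

INPUT_DIM = 5000   # genes per sample, as stated in the text
LATENT_DIM = 36    # size of the learned feature space, as stated in the text


def init_linear(n_in, n_out, rng):
    """Dense layer with small random weights (hypothetical initialisation)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, layer):
    """Apply a dense layer: one dot product plus bias per output unit."""
    weights, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def vae_forward(x, enc_mu, enc_logvar, dec, rng):
    """Encode x, sample a latent vector, and decode it back to gene space."""
    mu = linear(x, enc_mu)
    log_var = linear(x, enc_logvar)
    # Reparameterisation trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    z = [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]
    x_hat = [sigmoid(v) for v in linear(z, dec)]
    return mu, log_var, z, x_hat


def vae_loss(x, x_hat, mu, log_var):
    """Negative ELBO: reconstruction error plus KL divergence to N(0, I)."""
    eps = 1e-7  # avoids log(0)
    recon = -sum(
        xi * math.log(xh + eps) + (1.0 - xi) * math.log(1.0 - xh + eps)
        for xi, xh in zip(x, x_hat)
    )
    kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))
    return recon + kl, recon, kl


if __name__ == "__main__":
    rng = random.Random(0)
    enc_mu = init_linear(INPUT_DIM, LATENT_DIM, rng)
    enc_logvar = init_linear(INPUT_DIM, LATENT_DIM, rng)
    dec = init_linear(LATENT_DIM, INPUT_DIM, rng)
    # Synthetic stand-in for one patient's expression profile, scaled to [0, 1].
    x = [rng.random() for _ in range(INPUT_DIM)]
    mu, log_var, z, x_hat = vae_forward(x, enc_mu, enc_logvar, dec, rng)
    total, recon, kl = vae_loss(x, x_hat, mu, log_var)
    print(f"latent size: {len(mu)}, loss: {total:.1f} (recon {recon:.1f}, KL {kl:.3f})")
```

In a trained model, the 36-dimensional mean vector `mu` would serve as the patient's feature representation for downstream classification or visualization. Training itself requires gradient-based optimization, which this sketch omits and which would normally be delegated to a deep learning framework.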