Overview

See the README.md in the main directory for details on the source data. The data come from two different research projects (I believe from the same group), both of which aim to assemble a list of surfaceome proteins/genes. CSPA (Cell Surface Protein Atlas) is an experimentally derived list of surface proteins. The other, SURFY, is inferred by machine learning. This package loads these tables and makes them available in R. The source files are available on the projects' respective websites as well as in the publications.

CSPA

The Cell Surface Protein Atlas is an empirically determined set of surface proteins. We are using S2_File.xlsx from the website, which is the same file as in the publication. The spreadsheet contains a couple of sheets. Per the paper (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4404347/), we want Table A, which is the list of human surfaceome proteins.

Note that there isn’t much to do here, since each omics type will require its own mapping. We provide the default (gene) mapping that the table comes with (RNASeq data should probably use the ENTREZ gene symbol field). Mapping to the U133 chip should involve the ENTREZ_gene_id field.

For convenience, I will create three different tables:

- cspa: the original
- cspa_gene: the original with GENE (not sure about name)
- cspa_u133plus2: the original with probeset
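As a rough illustration of how the three tables could relate to each other, here is a minimal base R sketch. It is not the package's actual code: the toy rows, the `ENTREZ_gene_symbol` column name, and the `probe_map` table are all made up for the example. In practice the probeset mapping would presumably come from a chip annotation resource rather than a hand-written data frame.

```r
# Toy stand-in for the CSPA table; column names and values are illustrative only.
cspa <- data.frame(
  ENTREZ_gene_id = c(101L, 102L, 103L),
  ENTREZ_gene_symbol = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Hypothetical probeset-to-gene mapping: a gene may have several probesets or none.
probe_map <- data.frame(
  probeset = c("p1_at", "p2_at", "p3_at"),
  ENTREZ_gene_id = c(101L, 101L, 103L),
  stringsAsFactors = FALSE
)

# Gene-level table: the original plus an explicit GENE column.
cspa_gene <- cspa
cspa_gene$GENE <- cspa$ENTREZ_gene_symbol

# Probeset-level table: inner join on the Entrez gene ID.
# Genes without a probeset are dropped; genes with several probesets repeat.
cspa_u133plus2 <- merge(cspa, probe_map, by = "ENTREZ_gene_id")
```

Note that the join changes the row count: the probeset table has one row per probeset rather than one row per protein, so it should not be used to count surface proteins.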

Surfaceome Predictions

A research group has produced in silico predictions of surface proteins (https://wlab.ethz.ch/surfaceome/). This was published in PNAS (https://www.pnas.org/content/115/46/E10988). The resulting “Surfaceome” is available from the supplemental data in that publication (https://www.pnas.org/highwire/filestream/834129/field_highwire_adjunct_files/1/pnas.1808790115.sd01.xls) as tab "11.7_Surfaceome". This sheet is the same as the "in silico surfaceome only" tab of the https://wlab.ethz.ch/surfaceome/table_S3_surfaceome.xlsx spreadsheet.
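Reading that tab from a locally downloaded copy of the workbook might look like the sketch below. The helper name `read_surfaceome_sheet` and the local file path are assumptions for illustration, and the sketch relies on the readxl package being installed. The sheet may also need extra arguments (for example to skip leading rows), which is not checked here.

```r
# Hypothetical helper: read one sheet from a locally downloaded workbook.
read_surfaceome_sheet <- function(path, sheet = "11.7_Surfaceome") {
  if (!file.exists(path)) {
    stop("Workbook not found: ", path)
  }
  # readxl::read_excel() handles both .xls and .xlsx files.
  as.data.frame(readxl::read_excel(path, sheet = sheet))
}

# Usage, assuming the supplemental file was saved to the working directory:
# surfy <- read_surfaceome_sheet("pnas.1808790115.sd01.xls")
```

Passing the sheet by name rather than by position keeps the call stable if the order of tabs in the workbook changes.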

The full “Dataset_S01” spreadsheet from the PNAS publication is also downloaded. At the moment, this is because Sheet 11.10 provides a cleaner categorization of proteins into “Almen category” and “Almen subclass”.