Open databases for copy number variation similar to TCGA

The Cancer Genome Atlas (TCGA) has open data on copy number variation (CNV) from at least 10k different cancer patients. It offers two types of data: CNV data from tumors and CNV data from normal tissue samples. Are there other open databases that offer CNV data for at least one cancer type?


ICGC has CNV data for many cancer types. It has many controlled-access and open datasets. The DCC release page lets you browse them, and the public ones can be downloaded easily. They also have matched expression, SNV, DNA methylation, and structural mutation data for many samples.


CODEX2: full-spectrum copy number variation detection by high-throughput DNA sequencing

High-throughput DNA sequencing enables detection of copy number variations (CNVs) on the genome-wide scale with finer resolution than array-based methods, but it suffers from biases and artifacts that lead to false discoveries and low sensitivity. We describe CODEX2, a statistical framework for full-spectrum CNV profiling that is sensitive for variants with both common and rare population frequencies and that is applicable to study designs with and without negative control samples. We demonstrate and evaluate CODEX2 on whole-exome and targeted sequencing data, where biases are the most prominent. CODEX2 outperforms existing methods and, in particular, significantly improves sensitivity for common CNVs.


Genome sequence variation

http://www.1000genomes.org/
Data collection and a catalog of human variation

dbVar and the Database of Genomic Variants

Online Mendelian Inheritance in Man (OMIM)

http://www.omim.org/about
OMIM is a comprehensive, authoritative compendium of human genes and genetic phenotypes that is freely available and updated daily. The full-text, referenced overviews in OMIM contain information on all known mendelian disorders and over 12,000 genes. OMIM focuses on the relationship between phenotype and genotype, and its entries contain many links to other genetics resources.

Exome Aggregation Consortium (ExAC)

http://exac.broadinstitute.org/
ExAC is a coalition of investigators seeking to aggregate and harmonize exome sequencing data from a variety of large-scale sequencing projects, and to make summary data available to the wider scientific community. The data set provided on this website spans 61,486 unrelated individuals sequenced as part of various disease-specific and population genetic studies. We have removed individuals affected by severe pediatric disease, so this data set should serve as a useful reference set of allele frequencies for severe disease studies. All of the raw data from these projects have been reprocessed through the same pipeline and jointly variant-called to increase consistency across projects.

Encyclopedia of DNA Elements (ENCODE) Project

http://encodeproject.org/
Link to uniformly processed ENCODE2 histone mark data: https://sites.google.com/site/anshulkundaje/projects/encodehistonemods
Link to other uniformly processed ENCODE2 data: http://genome.ucsc.edu/ENCODE/downloads.html
Data collection, integrative analysis, and a comprehensive catalog of
all sequence-based functional elements

Roadmap Epigenomics Project (NIH Common Fund)

International Human Epigenome Consortium (IHEC)

http://www.ihec-epigenomes.org/
Data collection and reference maps of human epigenomes for key
cellular states relevant to health and disease

Human BodyMap

Viewable with Ensembl (http://www.ensembl.org/index.html) or the Integrative Genomics Viewer (http://www.broadinstitute.org/igv/)
Gene expression database from Illumina, based on RNA-seq data

Cancer Cell Line Encyclopedia (CCLE)

http://www.broadinstitute.org/ccle/home
Array-based expression data, CNV, mutations, and perturbations across a very large collection of cell lines

FANTOM5 Project

http://fantom.gsc.riken.jp/
http://fantom.gsc.riken.jp/5/sstar/Data_source
Large collection of CAGE-based expression data across multiple species (time series and perturbations)

http://www.ebi.ac.uk/gxa/
Database supporting queries of condition-specific gene expression on
a curated subset of the ArrayExpress Archive.

GNF Gene Expression Atlas

Viewable at BioGPS (http://biogps.org/#goto=welcome)
GNF (Genomics Institute of the Novartis Research Foundation) human and mouse gene expression array data.

http://www.proteinatlas.org/
Protein expression profiles based on immunohistochemistry for a large number of human tissues, cancers, and cell lines, plus subcellular localization and transcript expression levels

http://www.uniprot.org/
A comprehensive, freely accessible database of protein sequence and
functional information

http://www.ebi.ac.uk/interpro/
Integrated database of protein classification, functional domains,
and annotation (including GO terms).

Protein Capture Reagents Initiative

http://commonfund.nih.gov/proteincapture/
Resource generation: renewable monoclonal antibodies and other reagents that target a broad range of proteins

Knockout Mouse Program (KOMP)

Connectivity Map (CMAP)

http://www.broadinstitute.org/cmap/
The Connectivity Map (also known as cmap) is a collection of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules, together with simple pattern-matching algorithms, that enable the discovery of functional connections between drugs, genes, and diseases through the transitory feature of common gene-expression changes. You can learn more about cmap from our papers in Science and Nature Reviews Cancer.

Library of Integrated Network-based Cellular Signatures (LINCS)

https://commonfund.nih.gov/LINCS/
Data collection and analysis of molecular signatures that describe how
different types of cells respond to a variety of perturbing agents

Genomics of Drug Sensitivity in Cancer

http://www.cancerrxgene.org/
Mutations, CNV, Affy expression, and drug sensitivity in

Drug Gene Interaction Database (DGIdb)

Molecular Libraries Program (MLP)

https://commonfund.nih.gov/molecularlibraries/index.aspx
Access to the large-scale screening capacity needed to identify small molecules that can be optimized as chemical probes to study the functions of genes, cells, and biochemical pathways in health and disease

http://www.brain-map.org/
Data collection and online public resources integrating extensive gene expression and neuroanatomical data for human and mouse, including variation of mouse gene expression by strain.

http://braincloud.jhmi.edu/
BrainCloud is a freely available, biologist-friendly, stand-alone application for exploring the temporal dynamics and genetic control of transcription in the human prefrontal cortex across the lifespan. BrainCloud was developed through a collaboration between the Lieber Institute and NIMH.

Human Connectome Project

http://www.humanconnectomeproject.org/
Data collection and integration to create a complete map of the structural and functional neural connections, within and across individuals

Geuvadis RNA sequencing project of 1000 Genomes samples

http://www.geuvadis.org/web/geuvadis
mRNA and small RNA sequencing on 465 lymphoblastoid cell line (LCL) samples from 5 populations of the 1000 Genomes Project: CEPH (CEU), Finns (FIN), British (GBR), Toscani (TSI), and Yoruba (YRI).

http://www.broadinstitute.org/achilles Project Achilles is a systematic effort aimed at identifying and cataloging genetic vulnerabilities across hundreds of genomically characterized cancer cell lines. The project uses genome-wide shRNA libraries to silence individual genes and identify those genes that affect cell survival. Large-scale functional screening of cancer cell lines provides a complementary approach to studies that aim to characterize the molecular alterations (mutations, copy number alterations, etc.) of primary tumors, such as The Cancer Genome Atlas. The overall goal of the project is to link cancer genetic dependencies to their molecular characteristics in order to identify molecular targets and guide therapeutic development.

Human Ageing Genomic Resources

The Cancer Genome Atlas (TCGA)

http://cancergenome.nih.gov/
Data collection and data repository, including cancer genome sequence data

International Cancer Genome Consortium (ICGC)

http://www.icgc.org/
Data collection and data repository for a comprehensive description of genomic, transcriptomic, and epigenomic changes in cancer

Genotype-Tissue Expression (GTEx) Project

https://commonfund.nih.gov/GTEx/
Data collection, data repository, and sample bank for human gene expression and regulation in multiple tissues, compared to genetic variation

Knockout Mouse Phenotyping Program (KOMP2)

https://commonfund.nih.gov/KOMP2/
Data collection for standardized phenotyping of a genome-wide collection of mouse knockouts

Database of Genotypes and Phenotypes (dbGaP)

http://www.ncbi.nlm.nih.gov/gap
Data repository for results from studies investigating the interaction of genotype and phenotype

NHGRI Catalog of Published GWAS

http://www.genome.gov/gwastudies/
Public catalog of published Genome-Wide Association Studies

Clinical Genomic Database

http://research.nhgri.nih.gov/CGD/
A manually curated database of conditions with known genetic causes, focusing on medically significant genetic data with available interventions.

NHGRI Breast Cancer Information Core

http://www.ncbi.nlm.nih.gov/clinvar/
ClinVar is designed to provide a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence. ClinVar collects reports of variants found in patient samples, assertions made regarding their clinical significance, information about the submitter, and other supporting data. The alleles described in submissions are mapped to reference sequences and reported according to the HGVS standard. ClinVar then presents the data for interactive users as well as those wishing to use ClinVar in daily workflows and other local applications. ClinVar works in collaboration with interested organizations to meet the needs of the medical genetics community as efficiently and effectively as possible.

Human Gene Mutation Database (HGMD)

http://www.hgmd.cf.ac.uk/ac/
The Human Gene Mutation Database (HGMD®) represents an attempt to collate known (published) gene lesions responsible for human inherited disease

NHLBI Exome Sequencing Project (ESP) Exome Variant Server

http://evs.gs.washington.edu/EVS/
The goal of the NHLBI GO Exome Sequencing Project (ESP) is to discover novel genes and mechanisms contributing to heart, lung, and blood disorders by pioneering the application of next-generation sequencing of the protein-coding regions of the human genome across diverse, richly phenotyped populations, and to share these datasets and findings with the scientific community to extend and enrich the diagnosis, management, and treatment of heart, lung, and blood disorders.

http://ghr.nlm.nih.gov/
Genetics Home Reference is the National Library of Medicine's website for consumer information about genetic conditions and the genes or chromosomes related to those conditions.

http://www.ncbi.nlm.nih.gov/books/NBK1116/
GeneReviews are expert-authored, peer-reviewed disease descriptions presented in a standardized format and focused on clinically relevant and medically actionable information on the diagnosis, management, and genetic counseling of patients and families with specific inherited conditions.

Global Alzheimer's Association Interactive Network (GAAIN)

http://www.gaain.org/
The Global Alzheimer's Association Interactive Network (GAAIN) is a collaborative project that will provide researchers around the globe with access to a vast repository of Alzheimer's disease research data and the sophisticated analytical tools and computational power needed to work with that data. Our goal is to transform the way scientists work together to answer key questions related to understanding the causes, diagnosis, treatment, and prevention of Alzheimer's and other neurodegenerative diseases.
In 2013, WGS data were obtained for the largest cohort of 800 Alzheimer's patients

Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium

http://web.chargeconsortium.com/
The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium was formed to facilitate genome-wide association study meta-analyses and replication opportunities among multiple large and well-phenotyped longitudinal cohort studies. They also have DNA methylation data alongside WGS and exome sequencing.

NIMH Center for Collaborative Genomic Studies on Mental Disorders


Results

Comprehensive epigenomic profiling in both BLCA cell lines and primary tumors

In this project, we performed RNA-Seq, ChIP-Seq for histone 3 lysine 27 acetylation (H3K27ac), Assay for Transposase-Accessible Chromatin using sequencing (ATAC-Seq), and genome-wide chromatin conformation capture experiments (Hi-C) on 4 bladder cancer cell lines (Fig. 1a), two of which (RT4 and SW780) were previously described as luminal and the other two (SCABER and HT1376) as basal [8, 25]. Based on the RNA-Seq data generated in this study, we used a previously reported molecular subtyping approach [26] to confirm their assignment to luminal and basal states. Our results confirmed RT4 and SW780 as belonging to the Luminal-papillary subtype, while SCABER and HT1376 belong to the Basal/squamous subtype (Additional file 1: Table S1). Each experiment in the bladder cancer cell lines had at least two biological replicates (Additional file 2: Table S2), and we observed high correlation between the two replicates (Additional file 3: Table S3). More importantly, we performed the same set of experiments on four muscle-invasive bladder tumors from patients as well. Using the same molecular subtyping method, we determined their subtypes as follows: T1 is Luminal-papillary, T3 is Stroma-rich, and T4 and T5 are basal/squamous.

Luminal and basal transcriptional BLCA subtypes are associated with distinct promoter and distal enhancer activity at the epigenetic level. a Overall study design. b Differentially expressed gene (DEG) analysis of luminal cell lines (RT4 and SW780) and basal cell lines (SCABER and HT1376) shows 427 basal-specific upregulated genes and 524 luminal-specific upregulated genes. c Heatmap of differential H3K27ac ChIP-Seq at promoters (left). H3K27ac signal intensity profile for each BLCA cell group (right). d Genome browser signal tracks for a panel of luminal and basal genes. Shown here are H3K27ac ChIP-Seq, ATAC-Seq, and RNA-Seq data tracks in RT4, SW780, SCABER, and HT1376 cells. e Promoter H3K27ac and the associated RNA-Seq signal for selected luminal and basal genes show remarkable similarity. f Integrated model associating H3K27ac peaks at distal enhancers with RNA-Seq gene expression identifies putative enhancers and gene regulation. The top 10,000 most variable enhancers (left heatmap) are plotted along with the corresponding gene expression (right heatmap). g Genome-wide H3K27ac signal correlation between bladder cancer cell lines and tumor samples shows similarity of the enhancer landscape

Luminal and basal transcriptional BLCA subtypes are associated with distinct promoter and distal enhancer activity at the epigenetic level

Enrichment of H3K27ac signal has been used to predict active promoters and distal enhancers [27, 28]. Therefore, we first performed ChIP-Seq for H3K27ac in all four cell types and four patient samples. We observed that biological replicates of H3K27ac ChIP-Seq always clustered together, indicating that our results are highly reproducible (Additional file 4: Figure S1A). Furthermore, we found that the two luminal lines (RT4 and SW780) clustered together, while the two basal cell lines (SCABER and HT1376) also clustered together (Additional file 4: Figure S1A). This clustering result indicates that global epigenomic profiles accurately reflect cell identity. The hierarchical clustering of the cell lines based on H3K27ac signal was also mirrored by global mRNA expression from RNA-Seq data (Additional file 4: Figure S1B). We performed differential gene expression analysis on the two groups of cell types (RT4 and SW780 vs. SCABER and HT1376) and identified 427 basal-specific genes (Additional file 5: Table S4) and 524 luminal-specific genes (Fig. 1b, Additional file 6: Table S5).

Next, we examined promoter usage based on H3K27ac signal at known genes. We confirmed that promoter H3K27ac intensity closely tracks gene expression (Fig. 1c), and clustering analysis based on promoter H3K27ac intensity was able to distinguish luminal and basal models of BLCA (Additional file 4: Figure S1C). For example, we observed that the two luminal-subtype BLCA cell lines RT4 and SW780 have similar H3K27ac patterns at the luminal genes FOXA1, GATA3, and PPARG (Fig. 1d, e), while the two basal cell lines have similar promoter marks at the genes encoding the basal/squamous markers KRT5/14. Interestingly, although HT1376 is classified as the basal/squamous subtype based on global gene expression, it shows a luminal-like promoter H3K27ac pattern at luminal genes (GATA3, KRT7/8/18; Fig. 1e).

H3K27ac peaks distal to gene promoter regions have been used as a marker for active enhancers [27, 29]. We took the same approach here and, on average, predicted 59,466 (40,731–78,506) enhancers in each cell line (Additional file 7: Table S6). To link distal enhancers to their target genes, we performed correlation-based distal enhancer peak-gene association as described in [30] and identified the top 10,000 variable distal enhancers that show significant correlation with their associated genes (correlation 0.5, P < 0.01; a total of 58,509 met our criteria; Fig. 1f and Additional file 8: Table S7). We observed that the enhancers cluster clearly by cell type, and their target genes show similar cell-type-specific patterns (Fig. 1f and Additional file 4: Figure S1D). In addition, to understand the clinical relevance of our findings, we performed H3K27ac ChIP-Seq in four muscle-invasive bladder cancer patient samples. Our results show remarkable correlation between the cell lines and the tumors (Fig. 1g). In summary, we show in these cell lines and in a limited cohort of tumors that epigenetic regulation correlates with molecular subtype assignment.
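The correlation-based linking of enhancers to genes can be sketched roughly as follows. This is a minimal illustration and not the procedure from [30]: the function names, the candidate pairs, and the toy signal values are all hypothetical, and only the correlation threshold of 0.5 comes from the text. The P-value filter (P < 0.01) is omitted here for brevity.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / sqrt(vx * vy)

def link_enhancers(enhancer_signal, gene_expression, candidates, min_r=0.5):
    """Keep candidate enhancer-gene pairs whose signals correlate across samples."""
    links = []
    for enhancer, gene in candidates:
        r = pearson(enhancer_signal[enhancer], gene_expression[gene])
        if r >= min_r:
            links.append((enhancer, gene, round(r, 3)))
    return links

# Hypothetical toy data: one value per sample (RT4, SW780, SCABER, HT1376).
enhancer_signal = {
    "enh_A": [9.0, 8.0, 1.0, 2.0],   # H3K27ac high in the luminal lines only
    "enh_B": [3.0, 3.5, 3.2, 2.9],   # roughly flat across lines
}
gene_expression = {
    "GENE1": [50.0, 45.0, 5.0, 8.0],
    "GENE2": [1.0, 9.0, 2.0, 8.0],
}
candidates = [("enh_A", "GENE1"), ("enh_B", "GENE2")]

links = link_enhancers(enhancer_signal, gene_expression, candidates)
print(links)
```

With only four samples a correlation coefficient is very noisy, which is one reason a significance filter such as the one in the text matters in practice.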

Distinct sets of transcription factor motifs are enriched in luminal- and basal-associated BLCA cis-regulatory DNA regions

We performed ATAC-Seq in the RT4, SW780, SCABER, and HT1376 cell lines to evaluate their genome-wide open chromatin status. On average, we identified 32,000 open chromatin regions in each cell line (Fig. 2a and Additional file 9: Table S8). Among them, 40.8% of the open chromatin regions are located in promoter regions, while 59.2% are located in distal regions. Overall, > 90% of promoter open chromatin regions overlap with H3K27ac (Additional file 4: Figure S2A, S2C–D). The overlap of distal ATAC-Seq peaks with H3K27ac is lower (Additional file 4: Figure S2A and Additional file 10: Table S9), at least in part because of the different number of peaks in the different data sets. Genome-wide correlation of ATAC-Seq shows that HT1376 and SCABER cluster together with 80% similarity (Additional file 4: Figure S2E) compared with luminal RT4 (65%). We note that this observation is consistent with the RNA-Seq-based and H3K27ac-based clustering (Additional file 4: Figure S1A and B).
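The promoter/distal split and the overlap with H3K27ac described above come down to interval overlap. The sketch below shows the idea on invented coordinates. The promoter windows, the peak lists, and the function names are assumptions for illustration, and the source does not state how promoter regions were defined.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def split_by_promoter(peaks, promoters):
    """Partition peaks into those overlapping a promoter window and the distal rest."""
    promoter_peaks, distal_peaks = [], []
    for peak in peaks:
        if any(overlaps(peak, window) for window in promoters):
            promoter_peaks.append(peak)
        else:
            distal_peaks.append(peak)
    return promoter_peaks, distal_peaks

def fraction_overlapping(query, reference):
    """Fraction of query peaks that overlap at least one reference peak."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)

# Hypothetical toy data, not taken from the study.
atac_peaks = [
    ("chr1", 1000, 1500), ("chr1", 5000, 5400),
    ("chr2", 200, 700), ("chr2", 9000, 9300), ("chr3", 100, 400),
]
promoter_windows = [("chr1", 900, 1200), ("chr2", 0, 500)]
h3k27ac_peaks = [("chr1", 800, 2000), ("chr2", 100, 900), ("chr2", 9100, 9500)]

promoter_peaks, distal_peaks = split_by_promoter(atac_peaks, promoter_windows)
promoter_fraction = fraction_overlapping(promoter_peaks, h3k27ac_peaks)
distal_fraction = fraction_overlapping(distal_peaks, h3k27ac_peaks)
print(promoter_fraction, distal_fraction)
```

The nested loops are quadratic in the number of peaks, so for tens of thousands of peaks per sample a sorted sweep or a dedicated interval tool would be the practical choice.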

Distinct sets of transcription factor motifs are enriched in luminal- and basal-associated BLCA cis-regulatory DNA regions. a A comprehensive set of distal ATAC-Seq signals that differ across three clusters (luminal-specific, basal-specific, and shared) and the corresponding H3K27ac signal. b Results of TF motif analysis are shown here as rank plots (left) and motifs (right) for luminal-specific (top) and basal-squamous-specific (bottom) open chromatin enhancers. c FOXA1- and GATA3-bound open chromatin located at distal enhancers of the RT4/luminal cell line is depicted here in three groups: FOXA1 only, GATA3 only, and FOXA1 and GATA3 co-binding sites. d Gene ontology pathway analysis for each group of binding sites (FOXA1 only, FOXA1 and GATA3, and GATA3 only). e Observed occurrences of TF motifs (AP-1, FOX Forkhead, and GATA) are shown here at distal enhancers and promoters of the three groups. f Genome-wide open chromatin of BLCA cell lines shows similarity with TCGA bladder tumors [30]

Next, we performed motif analysis of these open chromatin regions (Additional file 11: Table S10). We observed that binding sites for CTCF and the AP-1 complex are enriched in all cell lines (Fig. 2b and Additional file 4: Figure S2G). Further ranking of TF motifs by enrichment P-value revealed that luminal open chromatin regions (shared between RT4 and SW780) are enriched for binding motifs for GRHL2, TP53, and TP63, while basal open chromatin (shared between SCABER and HT1376) is enriched for binding motifs for TEAD1/4 and KLF factors (Fig. 2b). GRHL2 [31] has previously been reported as a luminal gene, thus validating our findings. Interestingly, binding motifs for the AP-1 complex proteins FOSL1/2, JUN/JUNB, ATF3, and BATF [32] are the most enriched motifs in both luminal and basal-squamous open chromatin. We then comprehensively mapped all TF motifs enriched in luminal, basal-squamous, and shared open chromatin at distal enhancers to examine the association between TFs and BLCA subtypes (Additional file 11: Table S10). We found that at distal enhancers, the luminal BLCA subtype is associated with previously reported steroid hormone receptor TFs. On the other hand, basal-squamous open chromatin regions at enhancers show enrichment for the previously unreported factors MADS-box TF MEF2C and homeobox TF OTX2. Not surprisingly, luminal pioneer TFs such as forkhead transcription factors (FOXA1/2/3, FOXF1, FOXK1, FOXM1) and GATA TFs (GATA3/4/6) are enriched in luminal-associated enhancers with an open chromatin conformation. More surprisingly, forkhead and GATA motifs were also identified in association with open chromatin at enhancer elements across cell lines (Additional file 11: Table S10).

While FOXA1 and GATA3 are known to have low expression in basal bladder cancer cell lines and tumors, the enrichment of forkhead and GATA motifs in open chromatin across BLCA cell lines suggests compensation by Forkhead and GATA factors other than FOXA1 and GATA3. In addition, enrichment of Forkhead and GATA motifs across cell lines in open chromatin regions may indicate that luminal-specific TFs are poised to bind these regions. Furthermore, FOXA1 and GATA3 are known to play a role in urothelium development [31], suggesting that their binding sites may be primed early during development. We also found that stem-cell-associated pioneer TFs such as KLF factors (KLF10/14), ATF factors (ATF1/2/4/7), and NANOG are enriched in basal-associated enhancers. This is interesting because there is a population of progenitor cells in the basal urothelium that can contribute to urothelial development and differentiation [33, 34].
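Ranking motifs by enrichment P-value can be illustrated with a one-sided hypergeometric test. The source does not name the motif tool or the statistical test that was used, so this is only one common way to compute such a P-value. The motif names and all counts below are invented.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n regions are drawn from N, of which K contain the motif."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

def rank_motifs(target_hits, background_hits, n_target, n_total):
    """Sort motifs from most to least enriched in the target regions."""
    scored = [
        (motif, hypergeom_pvalue(target_hits[motif], n_target,
                                 background_hits[motif], n_total))
        for motif in target_hits
    ]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical counts: 1000 open chromatin regions in total, 100 of them in
# the subtype-specific target set. Each motif occurs in 200 regions overall.
n_total, n_target = 1000, 100
background_hits = {"MOTIF_A": 200, "MOTIF_B": 200}
target_hits = {"MOTIF_A": 60, "MOTIF_B": 22}   # about 20 expected by chance

ranked = rank_motifs(target_hits, background_hits, n_target, n_total)
for motif, p in ranked:
    print(motif, p)
```

Here `MOTIF_A` appears in three times as many target regions as chance predicts and ranks first, while `MOTIF_B` is close to the chance expectation.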

FOXA1 and GATA3 bind at luminal open chromatin at distal regulatory enhancers to drive luminal-specific gene expression

We hypothesized that TFs such as FOXA1 and GATA3 bind at open chromatin regions to pioneer luminal enhancers and activate expression of the associated genes. To test this hypothesis, we performed GATA3 ChIP-Seq in the luminal BLCA cell line RT4 and obtained FOXA1 ChIP-Seq in RT4 cells from our previously published work (Additional file 12: Table S11) [8]. As predicted, the luminal TFs FOXA1 and GATA3 show enriched binding at open chromatin loci of luminal-associated (FOXA1, GATA3, PPARG, FGFR3, and FABP4) distal enhancers (Fig. 2c). More specifically, we found 1325 distal enhancers showing co-binding of FOXA1 and GATA3 in RT4 (Fig. 2c). Similarly, FOXA1 and GATA3 show enriched binding at open chromatin loci of luminal marker gene (FOXA1, ERBB3, KRT19, GPX2, and FABP4) promoters (Additional file 4: Figure S2F).

GO term analysis of genes proximal to these distal enhancer sites showed regulation of TGF beta production, epithelium development, regulation of transcription involved in cell fate commitment, and cell-cell adhesion biological processes (cadherin binding and adherens junction assembly) as terms associated with FOXA1. In addition, regulation of cellular component, cell size, and apical plasma membrane biological processes were the terms identified for GATA3-bound genes proximal to these distal enhancers, suggesting a strong involvement of both TFs in cell fate commitment and luminal differentiation (Fig. 2d). For proximal genes associated with distal enhancers bound by both FOXA1 and GATA3, the identified terms relate to various developmental processes and to regulation of mucus secretion and fat cell differentiation, both of which are important metabolic attributes of differentiated urothelium (Fig. 2d).

We then proceeded with motif analysis of the FOXA1-only, GATA3-only, and co-bound sites. Surprisingly, the AP-1 complex was specifically enriched at all distal enhancers in addition to FOXA or GATA motifs (Fig. 2e). The order of binding of these three factors remains to be investigated. Finally, to understand the clinical relevance of our findings, we compared our four BLCA cell lines with the TCGA muscle-invasive bladder tumor ATAC-Seq data [30] and found that the genome-wide open chromatin profiles of our cell lines clustered with distinct tumor groups (Fig. 2f), indicating that the open chromatin regions in these cell lines share similar patterns with patient tumors.

Luminal and basal subtypes of BLCA show potentially distinct 3D genome organization

Previous studies have shown that 3D chromatin organization is associated with epigenetic activation or silencing of genes in cells [35]. For example, most heterochromatin is known to be compacted within the nucleus and located near the lamina-associated periphery of the nuclear envelope [35]. To gain initial insight into the genome-wide 3D landscape of luminal and basal BLCA, we performed high-resolution Hi-C experiments on all four cell lines (at least 800 M reads each) and five patient bladder tumors (> 800 M reads each) (Additional file 4: Figure S3). We used our recently developed software, Peakachu [36], a machine-learning-based chromatin loop detection approach, to predict loops at 10 kb bin resolution. First, we identified an average of 56,315 loops (ranging from 38,271 to 69,032) in the four cell lines (prob > 0.8; Additional file 13: Table S12). Then, using the probability scores output by Peakachu, we determined subtype-specific chromatin loops as shown by Aggregate Peak Analysis (APA, Fig. 3a and Additional file 14: Table S13) [37]. Based on our approach, we observed more potentially luminal-specific loops in RT4 and SW780 (2299) relative to the basal BLCA models SCABER and HT1376 (2144). We then compared each of these categories with the loops detected in the five patient samples (Fig. 3b): 30–40% of the luminal- and basal-defined 3D chromatin loops identified in the cell lines were observed in these five tumor samples.
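The loop filtering and the cell-line-versus-tumor comparison can be sketched as below. Only the probability cutoff (0.8) and the 10 kb bin size come from the text. The tuple layout `(chrom, pos1, pos2, probability)` is a hypothetical representation rather than Peakachu's actual output format, and treating two loops as the same only when both anchors fall in identical bins is an assumption, since the matching rule is not stated.

```python
BIN_SIZE = 10_000  # 10 kb resolution, as stated in the text

def to_bins(chrom, pos1, pos2, bin_size=BIN_SIZE):
    """Map a pair of genomic positions to an ordered pair of bin indices."""
    b1, b2 = pos1 // bin_size, pos2 // bin_size
    return (chrom, min(b1, b2), max(b1, b2))

def confident_loops(calls, min_prob=0.8):
    """Keep loop calls whose probability score exceeds the cutoff."""
    return {to_bins(chrom, p1, p2) for chrom, p1, p2, prob in calls if prob > min_prob}

def shared_fraction(cell_line_loops, tumor_loops):
    """Fraction of cell-line loops that are also present in the tumor set."""
    if not cell_line_loops:
        return 0.0
    return len(cell_line_loops & tumor_loops) / len(cell_line_loops)

# Hypothetical loop calls, not taken from the study.
cell_line_calls = [
    ("chr1", 105_000, 1_250_000, 0.95),
    ("chr1", 300_000, 900_000, 0.85),
    ("chr2", 50_000, 400_000, 0.60),    # dropped: score below the cutoff
    ("chr3", 10_000, 200_000, 0.90),
]
tumor_calls = [
    ("chr1", 1_254_000, 101_000, 0.90),  # same bins as the first cell-line loop
    ("chr2", 50_000, 400_000, 0.99),
]

cell_loops = confident_loops(cell_line_calls)
tumor_loops = confident_loops(tumor_calls)
fraction = shared_fraction(cell_loops, tumor_loops)
print(len(cell_loops), fraction)
```

Exact bin matching is strict. Allowing a tolerance of one or two neighboring bins around each anchor would be a more forgiving alternative.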

Luminal and basal subtypes of bladder cancer show potentially distinct 3D genome organization. a Hi-C loop analysis of luminal and basal-squamous cell lines shows distinct luminal loops and basal-squamous loops. b Contacts identified in luminal and basal-squamous cell lines are shared and validated in five bladder cancer tumor samples. c Genome browser tracks for a selected luminal gene (FOXA1) and basal gene (KRT5) containing enhancer-promoter loops are shown here. Arcs indicate chromatin loops predicted using Hi-C data. d Types of contacts based on overlap of contact locations with either enhancers (H3K27ac at distal regions) or promoters (H3K27ac and H3K4me3 at promoters) in each cell line are shown. E-P, enhancer-promoter loops; E-E, enhancer-enhancer loops; P-P, promoter-promoter loops; E-N, enhancer-non-regulatory loops; P-N, promoter-non-regulatory loops; None, non-regulatory loops. e Enrichment of FOXA1 (left axis) and GATA3 (right axis) binding sites in RT4 (luminal) cells is shown here at their loop anchors

Finally, we examined the enhancer and promoter loops in each category for their association with subtype-specific gene expression. Examples are shown in Fig. 3c, where we found that the luminal gene FOXA1 and the basal gene KRT5 show an increased number of enhancer-promoter loops in luminal and basal cell lines, respectively. Overall, we observed that 40% of the chromatin loops are between enhancers and promoters (Fig. 3d). Furthermore, we found significant enrichment of FOXA1 and GATA3 binding sites at these loop anchors, suggesting the involvement of these pioneer factors in 3D genome regulation (Fig. 3e). This finding is consistent with previous studies reporting enrichment of FOXA1 binding sites in enhancer-promoter loops [38].
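The loop categories used in Fig. 3d (E-P, E-E, P-P, E-N, P-N, None) follow mechanically once each anchor has been labeled. The sketch below assumes each anchor already carries one label: "E" (enhancer), "P" (promoter), or "N" (non-regulatory). That annotation step, the function names, and the toy loop list are hypothetical.

```python
from collections import Counter

# Sort order so that a pair is always written enhancer first, non-regulatory last.
ORDER = {"E": 0, "P": 1, "N": 2}

def loop_type(anchor_a, anchor_b):
    """Return the loop category for a pair of anchor labels ('E', 'P', or 'N')."""
    first, second = sorted((anchor_a, anchor_b), key=ORDER.__getitem__)
    if first == "N":          # both anchors are non-regulatory
        return "None"
    return f"{first}-{second}"

def type_fractions(loops):
    """Fraction of loops falling into each category."""
    counts = Counter(loop_type(a, b) for a, b in loops)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

# Hypothetical anchor annotations for five loops.
loops = [("E", "P"), ("P", "E"), ("E", "E"), ("N", "P"), ("N", "N")]
fractions = type_fractions(loops)
print(fractions)
```

Sorting the two labels before naming the category makes the result independent of which anchor is listed first, so `("P", "E")` and `("E", "P")` both count as E-P.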

Copy number variation (CNV) and chromatin loops in bladder cancer

A hallmark of cancer is large structural variation (SV), which includes inversions, deletions, duplications, and translocations. Recently, it has been shown that changes in CNVs and SVs can lead to changes in 3D genome structure, including the formation of new topologically associating domains ("neo-TADs") [39] and the resulting "enhancer hijacking" [40]. Neo-TADs refer to the scenario where SV events lead to the formation of new chromatin domains, which in turn can affect the expression profile of the genes located in the region. In the "enhancer-hijacking" model, altered 3D genome organization results in abnormal enhancer interactions, with enhancers brought into proximity with the wrong target gene (typically an oncogene), resulting in inappropriate target activation.

We first systematically identified copy number variation (CNV) and SV events using the Hi-C data with the HiNT [41] and Hi-Cbreakfinder [42] software. We identified tens of large SVs, including inversions, deletions, and translocations (Fig. 4a, b, Additional file 4: Figures S4-S5, Additional file 15: Table 14). As expected, we observed fewer CNVs in the patient samples than in the cell lines. More importantly, we were able to reconstruct the local Hi-C maps surrounding the SV breakpoints. We could observe interesting enhancer-hijacking events and neo-TAD formation in these local Hi-C maps (Fig. 4c-h). These observations provide an important resource to further study the function of the re-arranged enhancers in the context of bladder cancer.

Chromatin interactions induced by structural variation (SV) events. A, B Circos plots showing intra- and inter-chromosomal SVs in SCABER (A) and SW780 (B). C A large intra-chromosomal translocation on chr9. D-H Inter-chromosomal translocations. The breakpoints were identified by the HiCBreakfinder software. We then reconstructed the local Hi-C maps across the breakpoints. RNA-Seq and H3K27ac ChIP-Seq tracks from the same cell type are shown below the Hi-C maps.

Neuronal PAS Domain Protein 2 (NPAS2) is a novel luminal BLCA TF which regulates luminal gene expression and cell migration

Genome-wide open chromatin analysis of BLCA cell lines provides an ideal platform for the identification of novel transcriptional regulators of BLCA cell fate and phenotype. Here we performed motif analysis of luminal-associated, basal-associated, and shared open chromatin regions, resulting in the identification of distinct TFs in each cluster. Many of them represent known families of subtype-specific regulators, such as the GATA, FOX, and ETS families at luminal-associated ATAC-Seq peaks. We also noticed a potential novel bHLH-containing regulator, NPAS2, whose motif is enriched in the luminal-associated and shared clusters, but not in basal-associated ATAC-Seq peaks (Fig. 5a). We examined its binding profile using the latest ENCODE data (HEPG2 cells) [43] and found that NPAS2 binds at the FOXA1 promoter region (Fig. 5b), but not at regulatory regions for basal marker genes. This suggests that NPAS2 may be an upstream regulator of FOXA1. We then checked the TCGA data and found that a high expression level of NPAS2 is significantly correlated with overall patient survival (Fig. 5c).

NPAS2 is a novel bladder cancer regulator. A P-values of the NPAS2 motif in luminal-associated (RT4, SW780), basal-associated (SCABER, HT1376), and shared open chromatin regions. B NPAS2 ChIP-seq signal near the luminal marker genes FOXA1, GATA3, and PPARG in the HEPG2 cell line. C NPAS2 Kaplan-Meier curve, shown for 2000 days with log-rank statistics and hazard ratio. D Transwell migration assay: representative crystal violet staining (left) and quantification of differences in transwell migration (right) following overexpression of NPAS2 in SCABER. E RT-qPCR results for the basal marker genes KRT5, KRT6A, STAT3, and TFAP2C in the wild-type and NPAS2-overexpressing SCABER basal cell line. F NPAS2, FOXA1/GATA3, and PPARG RT-qPCR results in the wild-type and FOXA1/GATA3-overexpressing SCABER basal cell line.

To further determine whether NPAS2 expression influences downstream target expression and phenotype, we overexpressed NPAS2 in the basal-squamous BLCA cell line SCABER. First, we performed transwell migration assays and found that overexpression of NPAS2 in SCABER cells decreased transwell migration (Fig. 5d). We then performed RT-qPCR experiments and found that basal marker genes (such as KRT5, KRT6A, and TFAP2C) are significantly downregulated following NPAS2 overexpression (Fig. 5e), suggesting that NPAS2 represses the expression of a subset of basal marker genes.

Because our functional genomics analysis suggests that FOXA1 and GATA3 cooperate to regulate luminal target genes [8], we individually overexpressed FOXA1 and GATA3 in SCABER cells to test their ability to regulate NPAS2 expression. We observed increased expression of NPAS2 by both FOXA1 and GATA3 overexpression (Fig. 5f).


Discussion

Advances in single-cell technologies present new challenges and opportunities for making biological discovery. Single-cell studies often involve large numbers of cells, which are powerful at characterizing cellular heterogeneity, but small numbers of biological samples, which are underpowered for discovering common disease genes. It has been shown by recent genome-wide association analysis that it is possible to enable new discovery by performing association analysis at cell-type resolutions [55]. For cancer and genetic diseases driven by somatic mutations, being able to obtain genetic footprint at various time and conditions can enable discovery of genes responsible for disease progression and resistance to therapy.

However, it remains unclear what analytical strategies should be deployed to achieve these benefits. The problem becomes even more challenging when CNAs are considered, as CNAs affect large regions of the genome and are difficult to trace using phylogenetic methods.

In our study, we demonstrated that it is possible to achieve the benefit by reconstructing copy number evolution history as a lineage tree, i.e., MEDALT, and performing permutation-based statistical analysis, i.e., LSA, to identify fitness-associated CNAs and genes.

We have learned several important lessons in our study.

First, it is important to perform accurate lineage tracing. Although the single-copy gain and loss model that we implemented in deriving MEDALTs is limited in complexity, it already performed substantially better than conventional phylogenetic algorithms such as MP, which assumes infinite sites, and NJ, which employs naïve distance metrics, as shown in our simulation and in real data analysis. It is conceivable that further development of methodology that incorporates more complex genome evolution mechanisms such as chromothripsis [56] can lead to better results.

An important goal was to represent convergent evolution that is likely prevalent in the lens of CNAs [10, 57]. Conventional phylogenetics algorithms strictly prohibit the expression of convergent evolution by disallowing an alteration to occur multiple times in a course of evolution [28]. Several new algorithms relaxed such limitation but were designed for analyzing point mutation data [58]. As shown in our analysis of the TNBC patients, genes identified based on convergent evolution analysis (i.e., PLSA) had an even higher fraction of known cancer genes than those identified based on cohort-level single-lineage LSA. Our result suggests that examining convergent evolution is likely a key component towards fully unleashing the power of single-cell studies.

Unlike canonical phylogenetic trees, MEDALTs are minimal spanning trees that do not contain unobserved internal ancestral nodes. Representing evolution using minimal spanning trees instead of phylogenetic trees was our deliberate choice, as it allowed us to develop polynomial-runtime solutions that are scalable to real datasets containing thousands of cells. It also allowed us to conveniently implement the biologically meaningful MED and enforce directionality constraints. Phylogenetic algorithms are likely effective when the number of cells is small and the alterations are simple to trace. Neither of these conditions applies to available SCCN datasets, which have CNAs evolving non-linearly in hundreds of cells. Moreover, we have shown in our simulation that, for the purpose of detecting fitness-associated alterations, our method outperformed phylogenetic approaches across a wide range of sample sizes.
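To make the idea of a tree built only from observed cells concrete, the following is a minimal sketch in Python. It is not the MEDALT implementation: it uses an undirected Prim's algorithm and a plain L1 distance between copy-number vectors as a stand-in for the MED and the directionality constraints mentioned above, and all names (`l1_distance`, `minimal_spanning_tree`, the toy `cells`) are hypothetical.

```python
from typing import List, Tuple


def l1_distance(a: List[int], b: List[int]) -> int:
    """Sum of absolute copy-number differences (a simple stand-in, not the MED)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def minimal_spanning_tree(
    profiles: List[List[int]], root: int = 0
) -> List[Tuple[int, int, int]]:
    """Naive Prim's algorithm; returns edges as (parent, child, distance)."""
    n = len(profiles)
    in_tree = [root]
    edges: List[Tuple[int, int, int]] = []
    while len(in_tree) < n:
        best = None
        # Scan every (tree node, outside node) pair and keep the cheapest edge.
        for u in in_tree:
            for v in range(n):
                if v in in_tree:
                    continue
                d = l1_distance(profiles[u], profiles[v])
                if best is None or d < best[2]:
                    best = (u, v, d)
        edges.append(best)
        in_tree.append(best[1])
    return edges


# Toy data: a hypothetical diploid root cell and three altered cells.
cells = [
    [2, 2, 2, 2],  # root (assumed normal cell)
    [2, 3, 2, 2],  # single gain
    [2, 3, 3, 2],  # a further gain on top of the previous cell
    [2, 2, 2, 1],  # independent loss
]
tree = minimal_spanning_tree(cells)
```

Every node in the resulting tree is an observed cell, so no ancestral profiles need to be inferred. This naive version scans all pairs at each step and is meant only for illustration on small inputs.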

A particular challenge in developing and evaluating computational lineage tracing methods is the lack of exact ground truth. Although various experimental technologies have been developed [59, 60], we are not aware of any that can be applied to trace copy number evolution in patient samples. To circumvent this, we utilized in silico simulation that mimics several prevalent CNA mechanisms to evaluate the accuracies of the reconstructed lineages and fitness-associated alterations. We also utilized longitudinal datasets on which we knew the biological stages of the cells to evaluate the chronological accuracy of the inference results. Although these strategies are unlikely sufficient to validate all the edges and lengths in the trees, they are objective and sufficient to discriminate various approaches.

Second, it is important to control biases in statistical inference. It is challenging to detect fitness-associated genes, as CNAs often affect a large number of genes and sample sizes are often small. Passenger CNAs that occur naturally in non-functional regions, such as those near fragile sites or repeats, could easily cloud the discovery. In addition, lineage tracing algorithms are unlikely to be perfect and could introduce distinct biases. To address these challenges, we employed LSA, which randomly permutes SCCN profiles into different cells to reduce the biases introduced by background genomic variation and technical noise. We also reconstructed trees from the permuted datasets to alleviate biases introduced by the lineage tracing algorithms. The evolutionarily meaningful MED metrics and constraints help our analyses focus on biologically relevant hypotheses, given limited computational resources. These procedures appeared important for achieving the reported accuracy. Further exploration of different ways to permute the data and to estimate the background distribution will likely lead to better results.
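The general shape of a permutation-based significance test can be sketched as follows. This is only an illustration under simplifying assumptions, not the LSA procedure itself: it shuffles the copy-number values of a single gene across cells and compares the mean within a fixed lineage, whereas the text above describes permuting whole SCCN profiles and rebuilding the tree for each permuted dataset. The function name, the statistic, and the toy data are hypothetical.

```python
import random
from typing import List


def permutation_p_value(
    values: List[float], lineage: List[int], n_perm: int = 1000, seed: int = 0
) -> float:
    """Empirical p-value for the mean copy number within a lineage.

    values  : copy number of one gene in every cell
    lineage : indices of the cells belonging to the lineage under test
    """
    rng = random.Random(seed)

    def stat(vals: List[float]) -> float:
        return sum(vals[i] for i in lineage) / len(lineage)

    observed = stat(values)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between cells and values
        if stat(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy example: the three cells of the lineage all carry an amplification.
copy_numbers = [4, 4, 4, 2, 2, 2, 2, 2, 2, 2]
p_amplified = permutation_p_value(copy_numbers, lineage=[0, 1, 2])
p_flat = permutation_p_value([2] * 10, lineage=[0, 1, 2])
```

When every cell has the same value, each permutation ties the observed statistic and the p-value is 1, which is the expected behaviour for a region without signal.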

We assessed the functional impact of the identified genes using cell-line CRISPR essentiality screen data. We confirmed that the set of fitness-associated, amplified genes discovered by our methods are significantly more essential than other control gene sets in cancer cell lines. We also nominated novel genes that appear to have prognostic values in TCGA and the METABRIC datasets. These assessment strategies likely have false positives and negatives. Further comprehensive, well-controlled and targeted experiments will likely be required to fully assess the functional impact and clinical values of these genes.

Lastly, it was exciting to observe benefits of our methods on both the scDNA-seq and the scRNA-seq data. Although RNA-derived copy number profiles may not be as accurate as those derived from DNAs, previous studies [61] suggested that they can reasonably distinguish tumor clones. Our study further revealed the value of scRNA-seq data in lineage tracing and supported the notion that genomic profiles, even approximations, are more accurate than transcriptomic profiles in determining biological timing of cells. Our results opened doors towards utilizing scRNA-seq as a platform to understand genetics underlying developmental processes and perform gene discovery.


Conclusions

The number of users proves that MEXPRESS, through its ease of use and unique, integrative data overview, found its place in the toolbox of many researchers. By combining a comprehensive visualization and statistical analysis in a single figure, MEXPRESS helps researchers quickly identify dysregulations and their clinical relevance in cancer. With this major, feedback-driven update, we aim to consolidate MEXPRESS’s place in the set of open source web tools available to researchers and clinicians.


Methods

Haploproficient genes and orthology analysis

The set of S. cerevisiae genes which are haploproficient in turbidostat culture was obtained using the growth data of [8] and an FDR cut-off of 0.02. This stringent FDR cut-off rigorously defines those genes for which heterozygosity confers a strong fitness advantage, but has no effect on the functional enrichment of genes identified as haploproficient. Genes defined as ‘haploproficient’ for the purposes of this study are listed in Additional file 1: Table S1. The set of chromosome maintenance-associated HP genes described in [8] overlaps, but is not coincident, with the HPGI set studied here, since the current set also includes DNA damage-response genes.

Orthology assignments were made using the InParanoid algorithm [50] and compared with the results of a BLAST [51] reciprocal best-hits search. GO enrichment searches were performed using the Babelomics 4 FatiGO tool [52]. To assess the significance of HP gene conservation, the number of HP genes having orthologs in a given Ascomycete species, given the number of S. cerevisiae HP genes, was compared against the whole-genome conserved proportion using a χ² or Fisher exact test (depending on sample size), with the null hypothesis of identical distribution. All findings of significance were reiterated using a Z test for difference of proportions. Where necessary, P values were corrected for multiple testing using the Bonferroni correction. Cell cycle and DNA damage repair pathways were obtained from the KEGG pathway database [53].
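As an illustration of the Z test for a difference of proportions and the Bonferroni correction named above, here is a minimal standard-library sketch. The function names and the example counts are invented for illustration and are not taken from the study; the sketch also treats the two groups as independent samples, which is an assumption.

```python
import math
from typing import List, Tuple


def two_proportion_z_test(k1: int, n1: int, k2: int, n2: int) -> Tuple[float, float]:
    """Two-sided Z test for the difference between proportions k1/n1 and k2/n2."""
    p1, p2 = k1 / n1, k2 / n2
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def bonferroni(p_values: List[float]) -> List[float]:
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts: 90 of 100 HP genes conserved vs 3000 of 6000 genome-wide.
z, p = two_proportion_z_test(90, 100, 3000, 6000)
z_null, p_null = two_proportion_z_test(50, 100, 3000, 6000)
adjusted = bonferroni([0.01, 0.5])
```

With equal proportions the statistic is zero and the p-value is 1; the Bonferroni step would be applied across the set of species tested.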

Expression data for S. cerevisiae genes were obtained from the Saccharomyces Genome Database [54] and protein expression levels from [55]. A list of human cancer genes/oncogenes was obtained from the Cancer Gene Index [17]; enrichment of HP genes amongst the orthologs was determined using a χ² test as above. CNV incidence across eight tumour types (breast invasive carcinoma, rectum adenocarcinoma, colon adenocarcinoma, kidney renal clear cell carcinoma, uterine corpus endometrioid carcinoma, glioblastoma multiforme, acute myeloid leukemia, lung adenocarcinoma, lung squamous cell carcinoma, serous cystadenocarcinoma), as measured by comparative genomic hybridisation, was obtained from the NCI Cancer Genome Atlas online data browser [17], with a copy number (log2 ratio) of magnitude >0.5 taken as the significance threshold. Details of the sampling and analysis of the tumour samples are described in [17]. A P-value for HP ortholog overrepresentation was calculated using a χ² test. The TCGA database was also used to perform a pathway search for overrepresentation of HP orthologs.
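The thresholding rule above (a log2 ratio of magnitude greater than 0.5 counts as a copy number change) can be sketched as follows. The function names and the example values are hypothetical; only the 0.5 threshold comes from the text.

```python
from typing import List


def classify_cnv(log2_ratio: float, threshold: float = 0.5) -> str:
    """Label a segment as gain, loss, or neutral from its log2 copy ratio."""
    if log2_ratio > threshold:
        return "gain"
    if log2_ratio < -threshold:
        return "loss"
    return "neutral"


def cnv_incidence(log2_ratios: List[float], threshold: float = 0.5) -> float:
    """Fraction of samples in which the gene exceeds the threshold in either direction."""
    altered = sum(classify_cnv(x, threshold) != "neutral" for x in log2_ratios)
    return altered / len(log2_ratios)


# Hypothetical log2 ratios for one gene across four tumour samples.
incidence = cnv_incidence([0.8, -0.6, 0.1, 0.0])
```

Values exactly at the threshold are treated as neutral here, since the text specifies a magnitude strictly greater than 0.5.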

Yeast strains

In total, 30 HP genes were chosen for analysis, based upon the criteria discussed in the Results above. The heterozygous deletion mutant of each gene was obtained from the heterozygous diploid deletion library (Open Biosystems), in the BY4743 (MATa/α, his3Δ1/his3Δ1, leu2Δ0/leu2Δ0, LYS2/lys2Δ0, met15Δ0/MET15, ura3Δ0/ura3Δ0) genetic background. For non-essential genes, the homozygous deletant was retrieved from the analogous homozygous diploid deletion library (Open Biosystems).

Control strains were the BY4743 WT, along with the heterozygous deletion mutant of the non-functional his3 locus, the non-HP, non-cell cycle ho/HO heterozygous deletion strain, and the heterozygous deletion mutant of the non-HP, cell cycle gene HSL1. In addition, heterozygous deletion mutants of the G1 and G2 cyclins were included in several of the experiments. A complete list of the strains used is provided in Additional file 6: Table S6.

Cell-cycle profiling

Flow cytometric analysis of the deletion strains’ cell cycle profiles was carried out following the method of [56]. Briefly, 10⁷ cells in mid-exponential phase were harvested, washed, and fixed in absolute ethanol at 4 °C overnight. Fixed cells were then collected, washed, and boiled for 15 minutes in 2 mg/mL RNAse in 50 mM Tris-Cl (pH 8), and incubated at 37 °C for 2–12 hours. Cells were resuspended in protease solution (5 mg/mL pepsin, 4.5 μL/mL concentrated HCl), incubated for 15 minutes at 37 °C and resuspended in 50 mM Tris (pH 7.5). For analysis, 50 μL of cell suspension was added to 1 mL of 1 mM Sytox Green in 50 mM Tris (pH 7.5), vortexed and analysed using a CyAn flow cytometer (Beckman Coulter). FlowJo (Tree Star) analysis software was used to fit histograms to the peaks representing 1C and 2C DNA content, and thereby calculate the number of cells in the G1 and G2 phases, and infer the number in S phase from the remaining fraction of the population.

Chronological lifespan assay

Cultures were inoculated from frozen stocks, grown overnight in YPD at 30 °C, and 200 μL of each was transferred into a well of a 96-well microtiter plate (Corning). Strains were present in duplicate on each plate, with a buffer of WT in the wells around the edge of the plate, so that edge effects would not affect test colony measurements. A Singer Rotor HDA colony pinning robot was used to spot four replicates of each well onto a YPD + 10 μg/mL phloxine B (Sigma) plate. Phloxine B is a fluorescein derivative taken up when the cell membrane is disrupted upon cell death [57]. Plates were incubated for 48 hours at 30 °C and photographed using an Epson 1240 Scanner. The colony images were analysed using custom image-analysis code written in MATLAB, with colony size measured by pixel count, and the fraction of dead cells by the intensity of colony redness [10]. Since these parameters are independent, this allowed the effect of cell viability upon colony growth to be separated from that of growth rate variation. The 96-well liquid cultures were incubated at 30 °C and, every second day over a period of three weeks, the colony pinning onto YPD + phloxine B and the image analysis were repeated. For each plate, the median culture intensity for each strain was compared with the growth of the WT on that plate, and also with the strain growth and viability after the initial 48-hour period. The experiment was performed twice.

At several points throughout the 3-week period, several strains were selected at random, and viability assayed by performing serial dilutions and counting colony-forming units. These results were checked for compatibility with the microplate viability results.

Apoptosis assays

The rate of occurrence of apoptosis in the different strain populations was measured in two ways. Apoptosis was first induced by pretreating cells with 0.001% or 0.01% MMS, or with 0.0001% or 0.001% TBHP, in overnight culture, keeping a negative, non-induced WT control sample.

The translocation of phosphatidyl serine to the cell surface, a marker of apoptosis [58], was measured using an Annexin V-FITC Apoptosis Detection kit (Sigma). Cells were harvested, washed in 1.2 M sorbitol, 0.5 mM MgCl2, 35 mM K phosphate (pH 6.8) and then digested in 5.5% glusulase (Sigma) and 15 U/mL lyticase (Sigma) for 2 hours at 28 °C. Spheroplasts were harvested, washed in binding buffer (10 mM Hepes/NaOH pH 7.4, 140 mM NaCl, 2.5 mM CaCl2 in 1.2 M sorbitol buffer) and resuspended in binding buffer/sorbitol. 5 μL of FITC-labelled annexin V and 10 μL of 10010 mg/mL propidium iodide were added to each sample, with control samples containing (1) no label, (2) FITC-annexin V only, and (3) PI only. Fluorescence was quantified using a CyAn (Beckman Coulter). Gates were fitted on the basis of the control samples, dividing a log PI versus log FITC plot into four quadrants: lower left (neither FITC- nor PI-stained), viable cells; upper left (PI stain only), necrotic cells; lower right (FITC only), early apoptotic cells; and upper right (PI- and FITC-stained), late apoptotic cells. FlowJo software (TreeStar) was used to count the fraction of the total cell population in each quadrant. The proportions of necrotic and apoptotic cells for each strain were normalised to strain viability (i.e. on the basis of the proportion of cells assigned to the lower-left FITC/PI quadrant), and the ratio of necrotic:apoptotic cells calculated. Ratios for each strain were normalised to the WT value, and the standard deviation across all samples calculated. Strains having a necrosis:apoptosis ratio further than 1.5x this standard deviation from WT levels were deemed to demonstrate abnormal apoptosis rates.
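The quadrant gating and the necrotic:apoptotic ratio described above can be sketched as follows. This is a simplified illustration, not the FlowJo workflow: the gate positions, the event values, and the function names are hypothetical, and the sketch assumes that at least one viable and one apoptotic event are present.

```python
from collections import Counter
from typing import List, Tuple


def quadrant(fitc: float, pi: float, fitc_gate: float, pi_gate: float) -> str:
    """Assign one event to a quadrant of the PI versus FITC plot."""
    fitc_pos = fitc > fitc_gate
    pi_pos = pi > pi_gate
    if fitc_pos and pi_pos:
        return "late_apoptotic"   # upper right
    if fitc_pos:
        return "early_apoptotic"  # lower right
    if pi_pos:
        return "necrotic"         # upper left
    return "viable"               # lower left


def necrosis_apoptosis_ratio(
    events: List[Tuple[float, float]], fitc_gate: float, pi_gate: float
) -> float:
    """Necrotic:apoptotic ratio, each proportion normalised to the viable count."""
    counts = Counter(quadrant(f, p, fitc_gate, pi_gate) for f, p in events)
    viable = counts["viable"]
    necrotic = counts["necrotic"] / viable
    apoptotic = (counts["early_apoptotic"] + counts["late_apoptotic"]) / viable
    return necrotic / apoptotic


# Hypothetical (FITC, PI) intensities with both gates set at 5.
events = [(1, 1), (1, 1), (10, 1), (1, 10), (1, 10), (10, 10)]
ratio = necrosis_apoptosis_ratio(events, fitc_gate=5, pi_gate=5)
```

The resulting per-strain ratios would then be normalised to the WT value before applying the 1.5 standard deviation criterion.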

Growth rate and drug sensitivity assays

Growth and drug sensitivity assays were performed both on solid media and in liquid cultures. For solid assays, the required drug concentration was added to YPD-agar containing 10 μg/mL phloxine B. Overnight cultures of the strains were spotted onto the (drug-containing) plates using a Singer Rotor, as above. Plates were incubated at 30 °C, photographed at 24 and 48 hours, and analysed using the image-processing code described above. Strain growth and viability were compared both with WT growth on the same plate, and with growth on YPD-agar (or YPD-agar plus DMSO, where the drug is DMSO-soluble). The ratio of viability and size with and without drug was calculated for every strain on a plate, and the standard deviation of all ratios calculated. Strains having a drug:untreated ratio more than two standard deviations above or below that of the WT were deemed to be resistant or sensitive, respectively.
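The resistant/sensitive call described above can be sketched as follows. The function name, the strain labels, and the colony sizes are hypothetical, and the sketch assumes the sample standard deviation is intended, which the text does not specify.

```python
import statistics
from typing import Dict


def call_drug_response(
    treated: Dict[str, float],
    untreated: Dict[str, float],
    wt: str = "WT",
    k: float = 2.0,
) -> Dict[str, str]:
    """Call each strain resistant, sensitive, or unchanged relative to the WT ratio."""
    ratios = {s: treated[s] / untreated[s] for s in treated}
    sd = statistics.stdev(list(ratios.values()))
    calls = {}
    for strain, r in ratios.items():
        if r > ratios[wt] + k * sd:
            calls[strain] = "resistant"
        elif r < ratios[wt] - k * sd:
            calls[strain] = "sensitive"
        else:
            calls[strain] = "unchanged"
    return calls


# Hypothetical colony sizes (pixel counts) on one plate.
strains = ["WT"] + [f"s{i}" for i in range(1, 9)] + ["slow"]
untreated = {s: 100.0 for s in strains}
treated = {s: 100.0 for s in strains}
treated["slow"] = 20.0  # grows poorly on the drug plate
calls = call_drug_response(treated, untreated)
```

Because the standard deviation is computed across all strains on the plate, a single extreme strain inflates the threshold, so calls on small plates should be read with caution.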

Assays in liquid culture were performed by transferring 5 μL of overnight culture into each well of a 96-well microtitre plate, containing 200 μL of YPD plus the required concentration of drug. Absorbance was measured for 30 hours at 30 °C using a BMG Optima plate reader, the maximum growth rate was calculated using a curve-fitting script written in R, and the growth rate for each strain was compared with that of the WT in the same plate, and with growth in YPD/YPD + DMSO.
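One common way to estimate a maximum growth rate from absorbance readings is to take the steepest slope of log-transformed absorbance over a sliding window. The sketch below illustrates that approach in Python; it is not the R curve-fitting script used in the study, whose details are not given here, and the function name, window size, and data are assumptions.

```python
import math
from typing import List


def max_growth_rate(times: List[float], ods: List[float], window: int = 3) -> float:
    """Largest least-squares slope of ln(OD) against time over any window."""
    logs = [math.log(od) for od in ods]
    best = float("-inf")
    for start in range(len(times) - window + 1):
        t = times[start:start + window]
        y = logs[start:start + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        # Ordinary least-squares slope within the window.
        slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y)) / sum(
            (ti - t_mean) ** 2 for ti in t
        )
        best = max(best, slope)
    return best


# Synthetic, purely exponential culture growing at 0.3 per hour.
hours = [0, 1, 2, 3, 4, 5]
od = [0.1 * math.exp(0.3 * h) for h in hours]
rate = max_growth_rate(hours, od)
```

On real plate-reader data the readings would first need blank subtraction, and a wider window helps to smooth measurement noise.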


References

Yi K, Ju Y. Patterns and mechanisms of structural variations in human cancer. Exp Mol Med. 2018;50:98.

Yang L, Luquette L, Gehlenborg N, Xi R, Haseley P, Hsieh C, Zhang C, Ren X, Protopopov A, Chin L, et al. Diverse mechanisms of somatic structural variations in human cancer genomes. Cell. 2013;153:919–29.

Zhang Y, Yang L, Kucherlapati M, Chen F, Hadjipanayis A, Pantazi A, Bristow C, Lee E, Mahadeshwar H, Tang J, et al. A pan-cancer compendium of genes deregulated by somatic genomic rearrangement across more than 1,400 cases. Cell Rep. 2018;24:515–27.

Campbell P, Getz G, Stuart J, Korbel J, Stein L. Pan-cancer analysis of whole genomes. Preprint. 2017. https://doi.org/10.1101/162784.

Zhang Y, Chen F, Fonseca N, He Y, Fujita M, Nakagawa H, Zhang Z, Brazma A, Creighton C. Whole genome and RNA sequencing of 1,220 cancers reveals hundreds of genes deregulated by rearrangement of cis-regulatory elements. Preprint. 2017. https://doi.org/10.1101/099861.

Deaton A, Bird A. CpG islands and the regulation of transcription. Genes Dev. 2011;25:1010–22.

Bird A. DNA methylation patterns and epigenetic memory. Genes Dev. 2002;16:6–21.

Pfeifer G. Defining driver DNA methylation changes in human cancer. Int J Mol Sci. 2018;19:E1166.

Morano A, Angrisano T, Russo G, Landi R, Pezone A, Bartollino S, Zuchegna C, Babbio F, Bonapace I, Allen B, et al. Targeted DNA methylation by homology-directed repair in mammalian cells. Transcription reshapes methylation on the repaired gene. Nucleic Acids Res. 2014;42:804–21.

Russo G, Landi R, Pezone A, Morano A, Zuchegna C, Romano A, Muller M, Gottesman M, Porcellini A, Avvedimento E. DNA damage and repair modify DNA methylation and chromatin domain of the targeted locus: mechanism of allele methylation polymorphism. Sci Rep. 2016;6:33222.

Allen B, Pezone A, Porcellini A, Muller M, Masternak M. Non-homologous end joining induced alterations in DNA methylation: a source of permanent epigenetic change. Oncotarget. 2017;8:40359–72.

Sun W, Bunn P, Jin C, Little P, Zhabotynsky V, Perou C, Hayes D, Chen M, Lin D. The association between copy number aberration, DNA methylation and gene expression in tumor samples. Nucleic Acids Res. 2018;46:3009–18.

Davis C, Ricketts C, Wang M, Yang L, Cherniack A, Shen H, Buhay C, Kang H, Kim S, Fahey C, et al. The somatic genomic landscape of chromophobe renal cell carcinoma. Cancer Cell. 2014;26:319–30.

Forbes S, Beare D, Boutselakis H, Bamford S, Bindal N, Tate J, Cole C, Ward S, Dawson E, Ponting L, et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 2017;45:D777–83.

Lawrence M, Stojanov P, Mermel C, Robinson J, Garraway L, Golub T, Meyerson M, Gabriel S, Lander E, Getz G. Discovery and saturation analysis of cancer genes across 21 tumour types. Nature. 2014;505:495–501.

Chen F, Zhang Y, Gibbons D, Deneen B, Kwiatkowski D, Ittmann M, Creighton C. Pan-cancer molecular classes transcending tumor lineage across 32 cancer types, multiple data platforms, and over 10,000 cases. Clin Cancer Res. 2018;24:2182–93.

Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 2003;100:9440–5.

Hu X, Wang Q, Tang M, Barthel F, Amin S, Yoshihara K, Lang F, Martinez-Ledesma E, Lee S, Zheng S, Verhaak R. TumorFusions: an integrative resource for cancer-associated transcript fusions. Nucleic Acids Res. 2018;46:D1144–9.

Peifer M, Hertwig F, Roels F, Dreidax D, Gartlgruber M, Menon R, Krämer A, Roncaioli J, Sand F, Heuckmann J, et al. Telomerase activation by genomic rearrangements in high-risk neuroblastoma. Nature. 2015;526:700–4.

Creighton C, Hernandez-Herrera A, Jacobsen A, Levine D, Mankoo P, Schultz N, Du Y, Zhang Y, Larsson E, Sheridan R, et al. Integrated analyses of microRNAs demonstrate their widespread influence on gene expression in high-grade serous ovarian carcinoma. PLoS One. 2012;7:e34546.

Ungewiss C, Rizvi Z, Roybal J, Peng D, Gold K, Shin D, Creighton C, Gibbons D. The microRNA-200/Zeb1 axis regulates ECM-dependent β1-integrin/FAK signaling, cancer cell invasion and metastasis through CRKL. Sci Rep. 2016;6:18652.

Kiuru-Kuhlefelt S, Sarlomo-Rikala M, Larramendy M, Söderlund M, Hedman K, Miettinen M, Knuutila S. FGF4 and INT2 oncogenes are amplified and expressed in Kaposi’s sarcoma. Mod Pathol. 2000;13:433–7.

Weischenfeldt J, Dubash T, Drainas A, Mardin B, Chen Y, Stütz A, Waszak S, Bosco G, Halvorsen A, Raeder B, et al. Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking. Nat Genet. 2017;49:65–74.

Godinho M, Meijer D, Setyono-Han B, Dorssers L, van Agthoven T. Characterization of BCAR4, a novel oncogene causing endocrine resistance in human breast cancer cells. J Cell Physiol. 2011;226:1741–9.

Kim J, Piao H, Kim B, Yao F, Han Z, Wang Y, Xiao Z, Siverly A, Lawhon S, Ton B, et al. Long noncoding RNA MALAT1 suppresses breast cancer metastasis. Nat Genet. 2018;50:1705–15.

Yang X, Han H, De Carvalho D, Lay F, Jones P, Liang G. Gene body methylation can alter gene expression and is a therapeutic target in cancer. Cancer Cell. 2014;26:577–90.

Dixon J, Selvaraj S, Yue F, Kim A, Li Y, Shen Y, Hu M, Liu J, Ren B. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature. 2012;485:376–80.

Andersson R, Gebhard C, Miguel-Escalada I, Hoof I, Bornholdt J, Boyd M, Chen Y, Zhao X, Schmidl C, Suzuki T, et al. An atlas of active enhancers across human cell types and tissues. Nature. 2014;507:455–61.

Taylor A, Shih J, Ha G, Gao G, Zhang X, Berger A, Schumacher S, Wang C, Hu H, Liu J, et al. Genomic and functional approaches to understanding cancer aneuploidy. Cancer Cell. 2018;33:676–89.

Knijnenburg T, Wang L, Zimmermann M, Chambwe N, Gao G, Cherniack A, Fan H, Shen H, Way G, Greene C, et al. Genomic and molecular landscape of DNA damage repair deficiency across The Cancer Genome Atlas. Cell Rep. 2018;23:239–54.

Bindea G, Mlecnik B, Tosolini M, Kirilovsky A, Waldner M, Obenauf A, Angell H, Fredriksen T, Lafontaine L, Berger A, et al. Spatiotemporal dynamics of intratumoral immune cells reveal the immune landscape in human cancer. Immunity. 2013;39:782–95.

Thorsson V, Gibbs D, Brown S, Wolf D, Bortone D, Ou Yang T, Porta-Pardo E, Gao G, Plaisier C, Eddy J, et al. The immune landscape of cancer. Immunity. 2018;48:812–30.

Mermel CH, Schumacher SE, Hill B, Meyerson ML, Beroukhim R, Getz G. GISTIC2.0 facilitates sensitive and confident localization of the targets of focal somatic copy-number alteration in human cancers. Genome Biol. 2011;12:R41.

Alaei-Mahabadi B, Bhadury J, Karlsson J, Nilsson J, Larsson E. Global analysis of somatic structural genomic alterations and their impact on gene expression in diverse human cancers. Proc Natl Acad Sci U S A. 2016;113:13768–73.

Drier Y, Lawrence M, Carter S, Stewart C, Gabriel S, Lander E, Meyerson M, Beroukhim R, Getz G. Somatic rearrangements across cancer reveal classes of samples with distinct patterns of DNA breakage and rearrangement-induced hypermutability. Genome Res. 2013;23:228–35.

Esteller M. Epigenetics in cancer. N Engl J Med. 2008;358:1148–59.

Eden A, Gaudet F, Waghmare A, Jaenisch R. Chromosomal instability and tumors promoted by DNA hypomethylation. Science. 2003;300:455.

Coarfa C, Pichot C, Jackson A, Tandon A, Amin V, Raghuraman S, Paithankar S, Lee A, McGuire S, Milosavljevic A. Analysis of interactions between the epigenome and structural mutability of the genome using Genboree Workbench tools. BMC Bioinformatics. 2014;15(Suppl 7):S2.

Hajkova P, Jeffries S, Lee C, Miller N, Jackson S, Surani M. Genome-wide reprogramming in the mouse germ line entails the base excision repair pathway. Science. 2010;329:78–82.

Laird P, Jaenisch R. DNA methylation and cancer. Hum Mol Genet. 1994;3 Spec No:1487–95.

James S, Pogribny I, Pogribna M, Miller B, Jernigan S, Melnyk S. Mechanisms of DNA damage, DNA hypomethylation, and tumor progression in the folate/methyl-deficient rat model of hepatocarcinogenesis. J Nutr. 2003;133:3740S–7S.

Yung C, O'Connor B, Yakneen S, Zhang J, Ellrott K, Kleinheinz K, Miyoshi N, Raine K, Royo R, Saksena G, et al. Large-scale uniform analysis of cancer whole genomes in multiple computing environments. Preprint. 2017. https://doi.org/10.1101/161638.

Wala J, Shapira O, Li Y, Craft D, Schumacher S, Imielinski M, Haber J, Roberts N, Yao X, Stewart C, et al. Selective and mechanistic sources of recurrent rearrangements across the cancer genome. Preprint. 2017. https://doi.org/10.1101/187609.

Chen K, Wallis J, McLellan M, Larson D, Kalicki J, Pohl C, McGrath S, Wendl M, Zhang Q, Locke D, et al. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods. 2009;6:677–81.

Chen F, Zhang Y, Şenbabaoğlu Y, Ciriello G, Yang L, Reznik E, Shuch B, Micevic G, De Velasco G, Shinbrot E, et al. Multilevel genomics-based taxonomy of renal cell carcinoma. Cell Rep. 2016;14:2476–89.

Lee A, Ewing A, Ellrott K, Hu Y, Houlahan K, Bare J, Espiritu S, Huang V, Dang K, Chong Z, et al. Combining accurate tumor genome simulation with crowdsourcing to benchmark somatic structural variant detection. Genome Biol. 2018;19:188.

Fonseca N, Kahles A, Lehmann K-V, Calabrese C, Chateigner A, Davidson N, Demircioğlu D, He Y, Lamaze F, Li S, et al. Pan-cancer study of heterogeneous RNA aberrations. Preprint. 2017. https://doi.org/10.1101/183889.

The Cancer Genome Atlas Research Network. Comprehensive molecular characterization of clear cell renal cell carcinoma. Nature. 2013;499:43–9.

Johnson W, Rabinovic A, Li C. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics. 2007;8:118–27.

Hoadley K, Yau C, Hinoue T, Wolf D, Lazar A, Drill E, Shen R, Taylor A, Cherniack A, Thorsson V, et al. Cell-of-origin patterns dominate the molecular classification of 10,000 tumors from 33 types of cancer. Cell. 2018;173:291–304.

McCarroll S, Kuruvilla F, Korn J, Cawley S, Nemesh J, Wysoker A, Shapero M, de Bakker P, Maller J, Kirby A, et al. Integrated detection and population genetic analysis of SNPs and copy number variation. Nat Genet. 2008;40:1166–74.

Gerstung M, Jolly C, Leshchiner I, Dentro S, Rosado S, Rosebrock D, Mitchell T, Rubanova Y, Anur P, Yu K, et al. The evolutionary history of 2,658 cancers. Preprint. 2018. https://doi.org/10.1101/161562.

Xie C, Leung Y, Chen A, Long D, Hoyo C, Ho S. Differential methylation values in differential methylation analysis. Bioinformatics. 2019;35:1094–7.

Creighton C, Nagaraja A, Hanash S, Matzuk M, Gunaratne P. A bioinformatics tool for linking gene expression profiling results with public databases of microRNA target predictions. RNA. 2008;14:2290–6.

Saldanha AJ. Java Treeview--extensible visualization of microarray data. Bioinformatics. 2004;20:3246–8.

Zhang Y, Yang L, Kucherlapati M, Chen F, Hadjipanayis A, Pantazi A, Bristow C, Lee E, Mahadeshwar H, Tang J, et al. R-code for linear models integrating expression data with somatic structural data. GitHub. 2019. https://github.com/chadcreighton/SV-expression_integration.


Metode

Haploproficient genes and orthology analysis

The set of S.cerevisiae genes which are haploproficient in turbidostat culture was obtained using the growth data of [8] and an FDR cutoff of 0.02. This stringent FDR cut-off rigorously defines those genes for which heterozygosity confers a strong fitness advantage, but has no effect on the functional enrichment of genes identified as haploproficient. Genes defined as ‘haploproficient’ for the purposes of this study are listed in Additional file 1: Table S1. The set of chromosome maintenance-associated HP genes described in [8] overlaps, but is not coincident, with the HPGI set studied here, since the current set also includes DNA damage-response genes.

Orthology assignments were made using the InParanoid algorithm [50] and compared with the results of a BLAST [51] reciprocal best-hits search. GO enrichment searches were performed using the Babelomics 4 FatiGO tool [52]. To assess the significance of HP gene conservation, the number of HP genes having orthologs in a given Ascomycete species, given the number of S. cerevisiae HP genes, was compared against the whole-genome conserved proportion using a χ² or Fisher exact test (depending on sample size), with the null hypothesis of identical distribution. All findings of significance were reiterated using a Z test for difference of proportions. Where necessary, P values were corrected for multiple testing using the Bonferroni correction. Cell cycle and DNA damage repair pathways were obtained from the KEGG pathway database [53].
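The conservation test can be illustrated with a minimal sketch, assuming a 2 × 2 table of HP versus non-HP genes against ortholog present versus absent. The counts are hypothetical and the function names are illustrative, not taken from the original analysis. The two-sided P value follows the common convention of summing all tables no more probable than the observed one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in map(prob, range(lowest, highest + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

def bonferroni(p_values):
    """Multiply each P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical counts: 3 of 4 HP genes and 1 of 4 non-HP genes have an ortholog.
p = fisher_exact_two_sided(3, 1, 1, 3)   # 34/70, about 0.486
corrected = bonferroni([p, 0.01, 0.2])   # one corrected P value per species tested
```

For larger counts a χ² test would normally replace the exact test, as the text describes; the sketch covers only the small-sample branch.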

Expression data for S. cerevisiae genes were obtained from the Saccharomyces Genome Database [54] and protein expression levels from [55]. A list of human cancer genes/oncogenes was obtained from the Cancer Gene Index [17]; enrichment of HP genes amongst the orthologs was determined using a χ² test as above. CNV incidence across eight tumour types (breast invasive carcinoma, rectum adenocarcinoma, colon adenocarcinoma, kidney renal clear cell carcinoma, uterine corpus endometrioid carcinoma, glioblastoma multiforme, acute myeloid leukemia, lung adenocarcinoma, lung squamous cell carcinoma, serous cystadenocarcinoma), as measured by comparative genomic hybridisation, was obtained from the NCI Cancer Genome Atlas online data browser [17], with a copy number (log2 ratio) of magnitude >0.5 taken as the significance threshold. Details of the sampling and analysis of the tumour samples are described in [17]. A P value for HP ortholog overrepresentation was calculated using a χ² test. The TCGA database was also used to perform a pathway search for overrepresentation of HP orthologs.
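The thresholding step can be sketched as follows, assuming one log2 copy-number ratio per tumour sample for a given gene and a strict inequality on a magnitude of 0.5. The function names and the example values are hypothetical.

```python
def cnv_calls(log2_ratios, threshold=0.5):
    """Label each log2 copy-number ratio as 'gain', 'loss' or 'neutral'."""
    calls = []
    for ratio in log2_ratios:
        if ratio > threshold:
            calls.append("gain")
        elif ratio < -threshold:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls

def cnv_incidence(log2_ratios, threshold=0.5):
    """Fraction of samples whose ratio exceeds the threshold in magnitude."""
    calls = cnv_calls(log2_ratios, threshold)
    return sum(call != "neutral" for call in calls) / len(calls)

# Hypothetical ratios for one gene across four tumour samples.
example = [0.8, -0.6, 0.1, 0.5]
incidence = cnv_incidence(example)  # 0.5: one gain and one loss out of four
```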

Yeast strains

In total, 30 HP genes were chosen for analysis, based upon the criteria discussed in the Results above. The heterozygous deletion mutant of each gene was obtained from the heterozygous diploid deletion library (Open Biosystems), in the BY4743 (MATa/α, his3Δ1/his3Δ1, leu2Δ0/leu2Δ0, LYS2/lys2Δ0, met15Δ0/MET15, ura3Δ0/ura3Δ0) genetic background. For non-essential genes, the homozygous deletant was retrieved from the analogous homozygous diploid deletion library (Open Biosystems).

Control strains were the BY4743 WT, along with the heterozygous deletion mutant of the non-functional his3 locus; the non-HP, non-cell cycle ho/HO heterozygous deletion strain; and the heterozygous deletion mutant of the non-HP, cell cycle gene HSL1. In addition, heterozygous deletion mutants of the G1 and G2 cyclins were included in several of the experiments. A complete list of the strains used is provided in Additional file 6: Table S6.

Cell-cycle profiling

Flow cytometric analysis of the deletion strains’ cell cycle profiles was carried out following the method of [56]. Briefly, 10⁷ cells in mid-exponential phase were harvested, washed, and fixed in absolute ethanol at 4 °C overnight. Fixed cells were then collected, washed, and boiled for 15 minutes in 2 mg/mL RNAse in 50 mM Tris-Cl (pH 8), and incubated at 37 °C for 2 hours. Cells were resuspended in protease solution (5 mg/mL pepsin, 4.5 μL/mL concentrated HCl), incubated for 15 minutes at 37 °C and resuspended in 50 mM Tris (pH 7.5). For analysis, 50 μL of cell suspension was added to 1 mL of 1 μM Sytox Green (in 50 mM Tris, pH 7.5), vortexed and analysed using a CyAn flow cytometer (Beckman Coulter). FlowJo (Tree Star) analysis software was used to fit histograms to the peaks representing 1C and 2C DNA content, and thereby calculate the number of cells in the G1 and G2 phases, and infer the number in S phase from the remaining fraction of the population.
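The final inference step is simple arithmetic and can be sketched as below. The sketch assumes the peak fitting has already produced G1 and G2 cell counts; the function name and the example counts are hypothetical, and the fit itself (done in FlowJo) is not reproduced here.

```python
def cell_cycle_fractions(total_cells, g1_cells, g2_cells):
    """Return G1, S and G2 fractions, taking S phase as the remainder."""
    s_cells = total_cells - g1_cells - g2_cells
    if total_cells <= 0 or s_cells < 0:
        raise ValueError("G1 and G2 counts must not exceed a positive total")
    return {
        "G1": g1_cells / total_cells,
        "S": s_cells / total_cells,
        "G2": g2_cells / total_cells,
    }

# Hypothetical counts from a 10,000-event acquisition.
fractions = cell_cycle_fractions(10000, 4500, 3500)
```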

Chronological lifespan assay

Cultures were inoculated from frozen stocks, grown overnight in YPD at 30 °C, and 200 μL of each was transferred into a well of a 96-well microtiter plate (Corning). Strains were present in duplicate on each plate, with a buffer of WT in the wells around the edge of the plate, so edge effects would not impact test colony measurements. A Singer Rotor HDA colony pinning robot was used to spot four replicates of each well onto a YPD + 10 μg/mL phloxine B (Sigma) plate. Phloxine B is a fluorescein derivative taken up when the cell membrane is disrupted upon cell death [57]. Plates were incubated for 48 hours at 30 °C and photographed using an Epson 1240 Scanner. The colony images were analysed using custom image-analysis code written in MATLAB, with colony size measured by pixel count, and fraction of dead cells by the intensity of colony redness [10]. Since these parameters are independent, this allowed the effect of cell viability upon colony growth to be separated from that of growth rate variation. The 96-well liquid cultures were incubated at 30 °C, and, every second day over a period of three weeks, the colony pinning onto YPD + phloxine B and the image analysis were repeated. For each plate, the median culture intensity for each strain was compared with the growth of the WT on that plate, and also with the strain growth and viability after the initial 48-hour period. The experiment was performed twice.

At several points throughout the 3-week period, several strains were selected at random, and viability assayed by performing serial dilutions and counting colony-forming units. These results were checked for compatibility with the microplate viability results.

Apoptosis assays

The rate of occurrence of apoptosis in the different strain populations was measured in two ways. Apoptosis was first induced by pretreating cells with 0.001% or 0.01% MMS, or with 0.0001% or 0.001% TBHP, in overnight culture, keeping a negative, non-induced WT control sample.

The translocation of phosphatidyl serine to the cell surface, a marker of apoptosis [58], was measured using an Annexin V-FITC Apoptosis Detection kit (Sigma). Cells were harvested, washed in 1.2 M sorbitol, 0.5 mM MgCl2, 35 mM K phosphate (pH 6.8) and then digested in 5.5% glusulase (Sigma) and 15 U/mL lyticase (Sigma) for 2 hours at 28 °C. Spheroplasts were harvested, washed in binding buffer (10 mM Hepes/NaOH pH 7.4, 140 mM NaCl, 2.5 mM CaCl2 in 1.2 M sorbitol buffer) and resuspended in binding buffer/sorbitol. 5 μL of FITC-labelled annexin V and 10 μL of 100 μg/mL propidium iodide (PI) were added to each sample, with control samples containing (1) no label, (2) FITC-annexin V only, and (3) PI only. Fluorescence was quantified using a CyAn flow cytometer (Beckman Coulter). Gates were fitted on the basis of the control samples, dividing a log PI versus log FITC plot into four quadrants: lower left (neither FITC nor PI-stained), viable cells; upper left (PI stain only), necrotic cells; lower right (FITC only), early apoptotic cells; and upper right (PI and FITC-stained), late apoptotic cells. FlowJo software (Tree Star) was used to count the fraction of the total cell population in each quadrant. The proportion of both necrotic and apoptotic cells for each strain was normalised to strain viability (i.e. on the basis of the proportion of cells assigned to the lower-left FITC/PI quadrant), and the ratio of necrotic:apoptotic cells calculated. Ratios for each strain were normalised to the WT value, and the standard deviation across all samples calculated. Strains having a necrosis:apoptosis ratio further than 1.5× this standard deviation from WT levels were deemed to demonstrate abnormal apoptosis rates.
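The normalisation and flagging rule can be sketched as follows. The sketch makes several assumptions that the text does not settle: apoptotic cells are taken as early plus late apoptotic, the standard deviation is the sample standard deviation of the WT-normalised ratios, and the quadrant fractions shown are invented for illustration. Function and key names are hypothetical.

```python
from statistics import stdev

def abnormal_apoptosis(quadrants, wt="WT", k=1.5):
    """Flag strains whose WT-normalised necrosis:apoptosis ratio lies more
    than k standard deviations from the WT value (1.0 after normalisation).

    quadrants maps strain -> fractions for 'viable', 'necrotic', 'early', 'late'.
    """
    ratios = {}
    for strain, q in quadrants.items():
        # Normalise each class to strain viability, then take the ratio.
        necrotic = q["necrotic"] / q["viable"]
        apoptotic = (q["early"] + q["late"]) / q["viable"]
        ratios[strain] = necrotic / apoptotic
    normalised = {s: r / ratios[wt] for s, r in ratios.items()}
    sd = stdev(list(normalised.values()))
    return {s: abs(v - 1.0) > k * sd for s, v in normalised.items()}

# Invented quadrant fractions for a WT control and three deletion strains.
example = {
    "WT": {"viable": 0.80, "necrotic": 0.05, "early": 0.10, "late": 0.05},
    "a":  {"viable": 0.80, "necrotic": 0.05, "early": 0.10, "late": 0.05},
    "b":  {"viable": 0.79, "necrotic": 0.06, "early": 0.10, "late": 0.05},
    "c":  {"viable": 0.60, "necrotic": 0.30, "early": 0.05, "late": 0.05},
}
flags = abnormal_apoptosis(example)
```

Because both classes are divided by the same viability, that term cancels in the ratio; it is kept in the code only to mirror the order of operations in the text.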

Growth rate and drug sensitivity assays

Growth and drug sensitivity assays were performed both on solid media and in liquid cultures. For solid assays, the required drug concentration was added to YPD-agar containing 10 μg/mL phloxine B. Overnight cultures of the strains were spotted onto the (drug-containing) plates using a Singer Rotor, as above. Plates were incubated at 30 °C, photographed at 24 and 48 hours, and analysed using the image-processing code described above. Strain growth and viability were compared both with WT growth on the same plate, and with growth on YPD-agar (or YPD-agar plus DMSO, where the drug is DMSO-soluble). The ratio of viability and size with and without drug was calculated for every strain on a plate, and the standard deviation of all ratios calculated. Strains having a drug:untreated ratio more than two standard deviations above or below that of the WT were deemed to be resistant or sensitive, respectively.
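The classification rule for the solid-media assay can be sketched as follows, assuming one measurement (colony size or viability) per strain with and without drug, and the sample standard deviation of all ratios on the plate. The function name, the "unchanged" label and the example values are hypothetical.

```python
from statistics import stdev

def classify_drug_response(treated, untreated, wt="WT", k=2.0):
    """Call each strain resistant, sensitive or unchanged relative to WT.

    treated and untreated map strain -> colony size (or viability) on the
    drug plate and on the drug-free plate, respectively.
    """
    ratios = {s: treated[s] / untreated[s] for s in treated}
    sd = stdev(list(ratios.values()))
    wt_ratio = ratios[wt]
    calls = {}
    for strain, ratio in ratios.items():
        if ratio > wt_ratio + k * sd:
            calls[strain] = "resistant"
        elif ratio < wt_ratio - k * sd:
            calls[strain] = "sensitive"
        else:
            calls[strain] = "unchanged"
    return calls

# Invented colony sizes (pixel counts) for six strains on one plate.
treated = {"WT": 100, "a": 100, "b": 100, "c": 100, "d": 100, "e": 40}
untreated = {strain: 100 for strain in treated}
calls = classify_drug_response(treated, untreated)
```

Because the standard deviation is taken over all strains on the plate, a single strong outlier inflates the threshold; with very few strains per plate no strain can exceed two standard deviations.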

Assays in liquid culture were performed by transferring 5 μL of overnight culture into each well of a 96-well microtitre plate containing 200 μL of YPD plus the required concentration of drug. Absorbance was measured for 30 hours at 30 °C using a BMG Optima plate reader, and the maximum growth rate was calculated using a curve-fitting script written in R. The growth rate for each strain was compared with that of the WT in the same plate, and with growth in YPD or YPD + DMSO.
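The authors' R curve-fitting script is not reproduced in the text. As a stand-in, the sketch below estimates the maximum growth rate in Python by a different, simpler technique: the steepest least-squares slope of ln(absorbance) against time over a sliding window. The window size, the function name and the example curve are assumptions.

```python
from math import log

def max_growth_rate(times, absorbance, window=5):
    """Largest least-squares slope of ln(absorbance) versus time over any
    run of `window` consecutive readings (units: per unit of time)."""
    if len(times) != len(absorbance) or len(times) < window or window < 2:
        raise ValueError("need equal-length series of at least `window` points")
    ln_od = [log(value) for value in absorbance]  # readings must be positive
    best = float("-inf")
    for start in range(len(times) - window + 1):
        t = times[start:start + window]
        y = ln_od[start:start + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        sxx = sum((a - t_mean) ** 2 for a in t)
        sxy = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
        best = max(best, sxy / sxx)
    return best

# Invented hourly readings: absorbance doubling every hour from 0.01.
hours = list(range(8))
readings = [0.01 * 2 ** h for h in hours]
rate = max_growth_rate(hours, readings)  # close to ln 2 per hour
```

Dividing a strain's rate by the WT rate from the same plate gives the relative growth rate described above.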


2. Methods

This section proposes an expanded graph database model that includes the gene expression, miRNA expression, DNA methylation, copy number gain and loss information, tissue slide information, and mutation data from TCGA. It also outlines the steps performed to create the proposed graph database model.

2.1. Data

For this study, we added copy number information, miRNA expression, and tissue slide image information to the previously stored clinical information, gene expression (log2 counts per million), hyper- and hypomethylation information, and missense mutation data from the Genomic Data Commons (GDC) for breast cancer (BRCA), prostate adenocarcinoma (PRAD), and pancreatic adenocarcinoma (PAAD). Table 1 shows the summary information about the data set used for this study.