
Feasible range of the coefficient in an unlimited growth model




Suppose you are given an unlimited growth model of the form:

$\frac{dP(t)}{dt} = k P(t)$

Obviously population growth is never truly unlimited, but let us assume for the moment that we are introducing a species into an environment where unlimited growth is possible, at least for a certain time -- that is, an invasive species.

Here $k$ is some growth rate of the population at time $t$, which is denoted by $P(t)$.
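For reference, this equation has the standard closed-form solution below, which ties $k$ (in units of 1/time) to an easily interpreted quantity, the doubling time $T_d$:

$$P(t) = P(0)\,e^{kt}, \qquad T_d = \frac{\ln 2}{k}$$

so a quick feasibility check on a reported $k$ amounts to asking whether the implied doubling time is plausible for the organism in question.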

What are some feasible values of $k$? In other words, how far above or below what value would a number have to be before I would know that the study I am reading is not plausible?

I am sure it differs for different kinds of animals, including mammals, birds, bacteria, and so on.


Hard bound: k must be greater than zero, unless you are talking about some cannibalistic species or something that does not suit the model at all.

As long as the species is productive in the new environment: k greater than 1. The population is presumably growing, or you would not be using an exponential growth model.

As mentioned before, you need to know the species for more detail. But if you look at generation times and litter sizes:

  • Some bacterial generation times (from here) range from 10 to 2000 minutes (33 hours). So that is $k=2$ per generation time. Per day you are looking at a lower bound of around 2 per day and an upper bound of $k=2^{144}\approx 10^{43}$ per day.
  • Mice have something like a 12-week generation time and a litter size of about 10, giving something like $k=10^{10}$ per year.
  • Elephants produce one young roughly every 25 years, so $k=16$ per century or so.

Of course these are all based on rough assumptions, but you are looking for guidance on an unrealistic model, so hopefully they will do.
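The rough figures in the list above can be reproduced with a short script. This is an illustrative sketch only, using the generation times quoted above and assuming non-overlapping generations with a fixed number of offspring per generation; the function name and example values are mine, not from the original discussion.

```python
def growth_factor(generation_time_days, offspring_per_generation=2.0, period_days=1.0):
    """Multiplicative growth factor over one period, assuming each individual
    is replaced by `offspring_per_generation` individuals every generation."""
    generations = period_days / generation_time_days
    return offspring_per_generation ** generations

# Fast-dividing bacterium: 10-minute generations -> 2**144, about 2e43 per day
print(growth_factor(10 / (24 * 60)))
# Slow bacterium: 2000-minute (~33 h) generations -> under one doubling per day
print(growth_factor(2000 / (24 * 60)))
# Elephant-like: one doubling every 25 years -> 2**4 = 16 per century
print(growth_factor(25 * 365.25, period_days=100 * 365.25))
```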


Household milk production and child growth: Evidence from Bangladesh

Milk production/consumption is strongly associated with child growth in European and African populations.

This is the first study to examine milk production and child growth in Asia.

It uses unique survey data to exploit quasi-experimental variation in exposure to milk production.

Milk production increases children's HAZ scores by 0.52 standard deviations in the 6-23 month age range.

However, milk production is associated with a 20-point decline in exclusive breastfeeding.


Background

Carbon catabolite repression (CCR) is a key mechanism controlling carbohydrate uptake in bacteria, and therefore also controls whether different carbon sources are metabolized in parallel or sequentially. Although described as a paradigm of bacterial metabolic regulation, the underlying mechanisms remain controversial (see [1, 2]). The system exhibits a high degree of complexity, comprising metabolism, gene expression and signal processing. A typical example of CCR is the phenomenon of diauxic growth (Fig. 1).

Diauxic growth as a typical manifestation of carbon catabolite repression in bacteria (experimental data taken from [14]). The plot shows the sequential uptake of glucose (blue circles) and lactose (blue squares) by growing Escherichia coli bacteria on a mixture of the two carbon sources. This leads to a two-phase accumulation of biomass (red circles), at a high growth rate (on glucose) and a lower growth rate (on lactose), until all carbon sources are exhausted.

Different hypotheses regarding the dynamic functioning of the system have been explored by various modelling approaches [2]. The aim of this study is to compare these hypotheses and their ability to capture several key characteristics of CCR within a single modelling framework. To this end, based on a (simple) core model structure with only four intracellular metabolites, we developed an ensemble of model variants, all of which display diauxic growth behaviour during batch cultivation on two substrates. The model variants differ only in a few structural properties and have only a small number of free parameters. The use of small models with few parameters allows us to focus on the underlying network structure when comparing the different model variants.

Ensemble modelling approaches have been used to explore and analyse different model structures and/or different parameter sets (see [3, 4] for examples of ensemble modelling). As can be seen in Fig. 2, we extend the scope of ensemble modelling in this study by adding another dimension. Rather than restricting the ensemble of model variants to static or dynamic models that quantitatively describe the mechanisms of carbohydrate assimilation and their regulation, we also introduce model variants that compensate for missing mechanistic information by using different (linear and nonlinear) optimization programs, applied either statically or dynamically. The main representatives of such optimization-based models are flux balance models [5, 6].

Overview of the ensemble modelling strategy used in this study. We distinguish not only between the types of model equations (static or dynamic), but also between mechanistic models and models based on (linear or nonlinear) optimization. The vertical axis reflects the increasing complexity of the optimization program: nonlinear problems are harder to solve than linear programs. The zero of this axis corresponds to models without optimization. Abbreviations used: AE (algebraic equations), ODE (ordinary differential equations), FBA (flux balance analysis), dFBA (dynamic flux balance analysis)

The models in the ensemble can be categorized according to regulatory, stoichiometric and physiological constraints, and differ from one another in only one aspect at a time. We distinguish four groups of models: (1) flux balance models that specify reaction kinetics only for substrate uptake and by-product excretion, (2) kinetic models including growth dilution effects, (3) kinetic models with regulation at the metabolic and/or genetic level, and (4) resource allocation models. Figure 3 provides an overview of all the model variants considered here. To quantify the model output and allow comparison of the model variants, a diauxic growth index D is introduced, indicating the degree of sequential utilization of the two carbon sources.

Overview of all model variants in the ensemble, divided into four groups: constraint-based models, kinetic models with growth dilution, kinetic models with regulatory mechanisms, and resource allocation models

To further assess model performance, we analysed two new experimental conditions, namely a glucose pulse applied to a culture growing on minimal medium with lactose, and a batch culture with unequal initial conditions for the transport systems. In the latter case, the less preferred substrate, lactose, was used in the pre-culture, so the respective enzymes were abundant at the start of the experiment. By comparing the experimental data for these two conditions with the model predictions, a number of models could be excluded, while the remaining model variants could still not be discriminated.

Based on this analysis, we conclude that models including known regulatory mechanisms such as inducer exclusion and activation of transporter and enzyme expression by a global transcription factor [1] are best able to account for the different experiments. It is likely, however, that a precise quantitative explanation of the control of carbohydrate uptake and metabolism involves the superposition of several different molecular mechanisms acting on different timescales. The exact contribution of each individual mechanism during specific stimulation of the system remains to be determined.


Logistic population growth levels off at the carrying capacity

To consider how resource limitation affects population growth, we need to incorporate the concept of carrying capacity, the maximum population size that the environment can sustain. Any individual born into this population increases the population size unless the number of deaths balances or exceeds the number of births. If the population size is to remain the same from one generation to the next, individuals must also die at the same rate. With exponential population growth, the population growth rate R is constant, but with the addition of a carrying capacity imposed by the environment, the population growth rate slows as the population size increases, and growth stops when the population reaches the carrying capacity.

When resources are limited, populations exhibit logistic growth. In logistic growth, population expansion slows as resources become scarce, and levels off when the carrying capacity of the environment is reached, resulting in an S-shaped curve. Source: OpenStax Biology

Mathematically, we can achieve this by incorporating a density-dependent term into the population growth equation, where K represents the carrying capacity:
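In the notation used in this passage (growth rate R, population size N, carrying capacity K), the standard logistic form of this equation is:

$$\frac{dN}{dt} = R\,N\,\frac{K-N}{K}$$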

Now the equation shows the population growth rate R modified by the density-dependent term, (K–N)/K.

What happens to population growth when N is small relative to K? When N is close to K? And when does the population add the most individuals in each generation?
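As a quick check on these questions (a standard result, not specific to this text): when $N \ll K$ the density-dependent term is close to 1 and growth is approximately exponential ($\approx RN$); when $N \to K$ the term approaches 0 and growth stops; and the number of individuals added per unit time is largest at intermediate density, since

$$\frac{d}{dN}\left[R\,N\,\frac{K-N}{K}\right] = R\left(1 - \frac{2N}{K}\right) = 0 \quad\Rightarrow\quad N = \frac{K}{2}.$$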


Materials and methods

Whale sharks have body patterns of spots and stripes that are unique to individuals, and photographs of these patterns taken by snorkelers or divers can be used as identifying marks in mark-recapture studies (Meekan et al., 2006; Holmberg et al., 2009). We combined this photo-identification approach with stereo-video and photogrammetry (Sequeira et al., 2016) to calculate the body size of individuals over time. The use of this non-invasive approach to estimate growth rates is possible for whale sharks because they form predictable aggregations in shallow coastal waters of warm tropical and subtropical regions and show a high degree of site fidelity to these aggregations, with some individuals recorded sporadically over periods of up to 20 years (Meekan et al., 2006; Norman and Morgan, 2016; Sequeira et al., 2016).

Data collection

We collected annual length measurements of whale sharks at Ningaloo Reef, Western Australia (22° S, 113° E), from 2009 to 2019, during the peak of the annual whale shark aggregation (Taylor, 1996), typically in the first week of May each year (Supplementary Figure 1). After locating a shark (usually by aerial survey), snorkelers entered the water and (i) took high-resolution identification (ID) photographs of the flank above each pectoral fin, from the fifth gill slit to the posterior tip of the pectoral fin, on both sides (Speed et al., 2007), (ii) assessed sex by checking for the presence or absence of claspers, and (iii) recorded a video sequence of the whole body using diver-operated stereo-video systems (DOVs) to obtain body-length measurements. All whale shark sightings were assigned an individual identification code. We used the ID photographs to identify repeat sightings and measurements of individual whale sharks within and between years, based on their distinguishing spot or stripe patterns (Meekan et al., 2006). The DOVs consisted of two Canon HFG25 cameras (25 frames per second, 1920 × 1080 pixels, wide field of view) or GoPro Hero 4 Black cameras (30 frames per second, 1920 × 1080 pixel resolution, medium field of view) mounted ∼0.85 to 1 m apart at an inward convergence angle (∼4°) in purpose-built housings designed to maintain calibration stability. The separation distance between cameras was greater than in conventional systems designed for teleost fish because a larger separation allows more accurate measurement of targets that are likely to be further away (Boutros et al., 2015). Before each field trip, camera calibration was carried out on a large calibration cube (∼2 × 2 m) placed on the bottom of a swimming pool and filmed at a range of distances to reflect the likely range of targets in the field. Calibration was performed using the CAL software (see footnote 1), and calibration accuracy was verified by measuring known lengths on a scale bar post-calibration in the pool. We measured the fork length (FL) and TL of individual whale sharks using the EventMeasure software (see footnote 1). We found FL to be the more consistent and reliable measurement, as TL was more prone to bias from flexion of the tail. However, to improve comparability with other published studies of whale sharks, we converted the estimates of all resighted individuals to TL using a derived FL-TL relationship (calculated using paired estimates across individuals fitted with a linear model; Figure 1).

Figure 1. Linear relationship predicting total length from fork length for whale sharks, based on 3 years of sampling and 124 paired shark measurements. Whale shark outline redrawn from Rohner et al. (2011).

Growth Analysis

We estimated sex-specific growth profiles for whale sharks using the differences in size between repeat sightings by fitting a von Bertalanffy growth model (VBGM) to estimate the parameters $L_\infty$ (mean asymptotic TL in m) and the growth coefficient K (which describes the curvature of growth towards $L_\infty$, in units of year⁻¹). We used the VBGM formulation for tagging data (Fabens, 1965; Francis, 1988):
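The Fabens formulation referred to here is commonly written as follows (the standard mark-recapture form of the VBGM, consistent with the variable definitions given below):

$$\Delta L = (L_\infty - L_1)\left(1 - e^{-K\,\Delta T}\right)$$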

where $L_1$ is the TL at first sighting, ΔL is the difference in TL between the first and last sightings, and ΔT is the time at liberty (in decimal years) between the first and last sightings of an individual. We solved for the parameters K and $L_\infty$ of this equation using non-linear least squares estimation via the nls() function in R (Baty et al., 2015). Growth model fits were constrained (y-intercept) to 0.6 m TL to reflect the estimated size at birth (Joung et al., 1996), and growth parameters were estimated separately for males and females because the numerical dominance of males precluded interpreting the results as a combined growth profile. Models were fitted with parameter ranges set at 0.0–0.5 year⁻¹ for K and a lower bound of 6 m for $L_\infty$. Confidence intervals were determined via 10,000 bootstrap resampling iterations. To check whether differences in sex-specific growth were simply due to the lower number and range of observations of females, we repeated the model fit for males while including only individuals that matched the range of initial lengths (4–7 m TL) and times at liberty (1–5 years) of the female observations.
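The study fitted this model with R's nls() and bootstrapped confidence intervals; as a language-agnostic illustration, a minimal equivalent least-squares fit can be sketched in Python with scipy.optimize.curve_fit. The data values below are hypothetical placeholders, the upper bound on $L_\infty$ is arbitrary, and this is not the authors' code.

```python
import numpy as np
from scipy.optimize import curve_fit

def fabens(X, Linf, K):
    """Fabens form of the VBGM: expected TL increment between sightings."""
    L1, dT = X                      # TL at first sighting (m), time at liberty (yr)
    return (Linf - L1) * (1.0 - np.exp(-K * dT))

# Hypothetical resighting data (not from the study).
L1 = np.array([4.2, 5.1, 6.0, 6.8, 7.5])       # TL at first sighting (m)
dT = np.array([1.0, 2.5, 3.0, 5.0, 8.0])       # years at liberty
dL = np.array([0.35, 0.60, 0.55, 0.70, 0.65])  # observed TL increment (m)

popt, _ = curve_fit(fabens, (L1, dT), dL, p0=[12.0, 0.05],
                    bounds=([6.0, 0.0], [20.0, 0.5]))
print(f"L_inf = {popt[0]:.2f} m TL, K = {popt[1]:.3f} per year")
```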

We compared our data set of individual growth trajectories with published estimates of whale shark growth profiles (Wintner, 2000; Hsu et al., 2014; Perry et al., 2018; Ong et al., 2020). For studies providing growth parameter estimates, we plotted the growth trajectories of individual sharks from Ningaloo as line segments overlaid on the respective von Bertalanffy growth curves, where the length-age combination of the initial observation was assumed to fall on the length-at-age predicted by the growth model. We carried out a similar exercise to examine the agreement between our growth parameter estimates and the observed growth trajectories of whale sharks in captivity (National Museum of Marine Biology and Aquarium, Taiwan; Okinawa Expo Aquarium and Osaka Kaiyukan Aquarium, Japan), using published data with growth intervals ranging from under a year to almost two decades (five males, two females and one individual of unknown sex, ranging from 0.6 to 4.9 m TL at initial size; Chang et al., 1997; Uchida et al., 2000; Wintner, 2000; Nishida, 2001; Matsumoto et al., 2019), as well as new data from the Georgia Aquarium (two males, two females, ranging from 4.1 to 4.7 m TL at initial size and observed for more than a decade).


Applications

Bayesian inference has been used across all fields of science. We describe a few examples here, although there are many other areas of application, such as philosophy, pharmacology, economics, physics, political science and so on.

Social and behavioural sciences

A recent systematic review examining the use of Bayesian statistics reported that the social and behavioural sciences — psychology, sociology and political science — have seen an increase in empirical Bayesian work 4 . In particular, there are two parallel uses of Bayesian methods that are becoming increasingly popular in the social and behavioural sciences: theory development and as a tool for model estimation.

Bayes' rule has been used as an underlying theory for understanding reasoning, decision-making, cognition and theory of mind, and has been especially prevalent in developmental psychology and related fields. Bayes' rule has been used as a conceptual framework for cognitive development in young children, capturing how children develop an understanding of the world around them 144 . Bayesian methodology has also been discussed in terms of improving the cognitive algorithms used for learning. Gigerenzer and Hoffrage 145 discussed the use of frequencies, rather than probabilities, as a method for improving Bayesian reasoning. In another important article, Slovic and Lichtenstein 146 discussed how Bayesian methods can be used for judgement and decision-making processes. In these areas of the social and behavioural sciences, Bayes' rule has been used as an important conceptual tool for developing theory and understanding developmental processes.

The social and behavioural sciences are a natural setting for applying Bayesian inference. The literature is rich with information that can be used to derive prior distributions. Informative priors are useful in complex modelling situations, which are common in the social sciences, as well as in cases of small sample sizes. Certain models used to explore educational outcomes and standardized testing, such as some multidimensional item response theory models, cannot be solved using frequentist statistics and require the use of Bayesian methods.

The number of publications on Bayesian statistics has been increasing steadily since 2004, with a more marked increase in the past decade. In part, this focus on Bayesian methods is due to the development of more accessible software, as well as a focus on publishing tutorials targeting applied social and behavioural scientists. A systematic review of Bayesian methods in the field of psychology found 740 eligible regression-based articles using Bayesian methods. Of these, 100 articles (13.5%) were tutorials on implementing Bayesian methods, and a further 225 articles (30.4%) were technical papers or commentaries on Bayesian statistics (Box 4). Methodologists have attempted to guide applied researchers towards using Bayesian methods in the social and behavioural sciences, although implementation has been relatively slow to take hold. For example, the systematic review found that only 167 regression-based Bayesian articles (22.6%) were applications using human samples. Nevertheless, several subfields regularly publish work implementing Bayesian methods.

The field has gained many interesting insights into psychological and social behaviour through Bayesian methods, and the substantive areas in which this work has been carried out are quite diverse. For example, Bayesian statistics has helped to uncover the role of craving suppression in smoking cessation 147 , to make population estimates based on expert opinion 148 , to examine the role of infant-care-related stress in divorce 149 , to test the impact of US presidents' ideology on US Supreme Court rulings 150 and to predict behaviours limiting the intake of free sugars in the diet 151 . All of these examples represent different ways in which Bayesian methodology is captured in the literature. It is common to find papers that highlight Bayes' rule as a mechanism for explaining theories of development and critical thinking 144 , that are expository 152,153 , that focus on how Bayesian reasoning can inform theory through the use of Bayesian inference 154 , and that use Bayesian modelling to extract findings that would be difficult to obtain using frequentist methods 147 . Overall, there is widespread use of Bayes' rule in the social and behavioural sciences.

We argue that the increased use of Bayesian methods in the social and behavioural sciences is of great benefit for improving substantive knowledge. However, we also feel that the field needs to continue developing rigorous implementation and reporting standards so that results are replicable and transparent. We believe that there are important benefits to implementing Bayesian methods in the social sciences, and we are optimistic that a strong focus on reporting standards can make these methods optimally useful for gaining substantive knowledge.

Box 4 Bayesian methods in the social and behavioural sciences

Hoijtink et al. 255 discuss the use of Bayes factors for informative hypotheses in cognitive diagnostic assessment, describing how Bayesian evaluation of informative diagnostic hypotheses can be used as an alternative to traditional diagnostic methods. There is additional flexibility with the Bayesian approach because informative diagnostic hypotheses can be evaluated using Bayes factors that use only the data from the individual being diagnosed. Lee 154 presents an overview of the application of Bayes' theorem in cognitive psychology, discussing how Bayesian methods can be used to develop more complete theories of cognitive psychology. Bayesian methods can also explain observed behaviour in terms of different cognitive processes, explain behaviour across a range of cognitive tasks and provide a conceptual unification of different cognitive models. Depaoli et al. 152 show how Bayesian methods can benefit health-based research conducted within psychology by highlighting how informative priors derived from expert knowledge and previous research can be used to better understand the physiological impact of health-based stressors. In this research scenario, frequentist methods would not produce viable results because the sample size was relatively small for the model being estimated, owing to the cost of data collection and a population that was difficult to access for sampling. Finally, Kruschke 153 presents the simplest example, using a t-test aimed at experimental psychologists, to show how Bayesian methods can benefit the interpretation of any model parameter. That paper highlights the Bayesian way of interpreting results, focusing on interpretation of the whole posterior rather than point estimates.

Ecology

The application of Bayesian analysis to answer ecological questions has become increasingly widespread owing to both philosophical arguments, particularly regarding subjective versus objective reasoning, and practical model-fitting advantages. This is combined with readily available software (Table 2) and numerous publications describing Bayesian ecological applications using these software packages (see refs 155,156,157,158,159,160,161 for examples). The underlying Bayesian philosophy is appealing in many ways in ecology 162 because it permits the incorporation of external, independent prior information, either from previous studies on the same or similar species or from inherent knowledge of the biological processes, within a rigorous framework 163,164 . Furthermore, the Bayesian approach also permits direct probabilistic statements to be made about parameters of interest, such as survival probabilities, reproductive rates, population sizes and future predictions 157 , and the calculation of the relative probabilities of competing models — such as the presence or absence of density dependence or of environmental factors in driving ecosystem dynamics — which in turn permits model-averaged estimates that incorporate both parameter and model uncertainty. The ability to provide probabilistic statements is particularly useful in relation to wildlife management and conservation. For example, King et al. 165 provide probability statements relating to the level of population decline over a given period of time, which in turn provides probabilities associated with the conservation status of the species.

Bayesian approaches are also often applied in ecological research for pragmatic reasons. Many ecological models are complex — for example, they may be spatio-temporal, high-dimensional and/or involve many interacting biological processes — leading to computationally expensive likelihoods that are slow to evaluate. Imperfect or limited data collection processes often lead to missing data and associated likelihoods that are intractable. In such circumstances, standard Bayesian model-fitting tools such as data augmentation can permit the models to be fitted, whereas in the alternative frequentist framework further model simplifications or approximations may be required. The application of Bayesian statistics in ecology is extremely wide-ranging and spans spatio-temporal scales from the level of the individual organism to the ecosystem level, encompassing understanding the population dynamics of a given system 166 , modelling spatial point pattern data 167 , investigating population genetics, estimating abundance 168 and assessing conservation management 169 .

Ecological data collection processes generally arise from observational studies, in which a sample is observed from the population of interest using some data survey protocol. Surveys should be carefully designed, taking into account the ecological question of interest and minimizing the model complexity required to fit the data in order to provide reliable inference. Nevertheless, associated model-fitting challenges may still arise from data collection problems, such as those resulting from equipment failure or poor weather conditions. There may also be data collection issues inherent in some data surveys, such as the inability to record individual-level information. Such model-fitting challenges may include — but are far from limited to — observations irregularly spaced in time owing to equipment failure or experimental design, measurement error owing to imperfect data observation, missing information at a range of different levels, from the individual level to the global environmental level, and challenges associated with multiscale studies in which different aspects of the data are recorded at different temporal scales — for example, from hourly individual location data to daily and monthly environmental data collection. The resulting data complexities, combined with the associated modelling choices, can lead to a range of model-fitting challenges that can often be addressed using standard techniques within the Bayesian paradigm.

For a given ecological study, separating out the individual processes acting on the ecosystem provides an attractive mechanism for simplifying model specification 166 . For example, state-space models provide a general and flexible modelling framework that describes two distinct types of process: the system process and the observation process. The system process describes the true underlying state of the system and how it changes over time. These states may be univariate or multivariate, such as population size or location data, respectively. The system process may also describe several processes acting on the system, such as birth, reproduction, dispersal and death. We are typically unable to observe these true underlying system states without some associated error, and the observation process describes how the observed data relate to the true, unknown states. This general state-space model encompasses many applications, including animal movement 170 , population count data 171 , capture-recapture-type data 165 , fisheries stock assessment 172 and biodiversity 173 . For a review of this topic and further applications, we direct the reader elsewhere 166,174,175 . Bayesian model-fitting tools, such as MCMC with data augmentation 176 , sequential Monte Carlo or particle MCMC 177,178,179 , permit general state-space models to be fitted to the observed data without the need to specify further restrictions — such as distributional assumptions — on the model specification, or to make additional likelihood approximations.

Data collection processes continue to evolve with technological advances. For example, GPS geolocation tags with associated accelerometers, remote sensing, drones for local aerial photography, unmanned underwater vehicles and motion-sensor camera traps are increasingly being used in ecological research. The use of these technological devices and the growth of crowdsourced science have generated new forms of data collected in large quantities, with associated model-fitting challenges, providing fertile ground for Bayesian analysis.

Genetics

Genetic and genomic studies have used Bayesian methods extensively. In genome-wide association studies, Bayesian approaches have provided a powerful alternative to frequentist approaches for assessing associations between genetic variants and a phenotype of interest in a population 180 . These include statistical models for incorporating genetic admixture 181 , fine-mapping to identify causal genetic variants 182 , imputation of genetic markers that were not directly measured using reference populations 183 , and meta-analysis to combine information across studies. These applications benefit further from the use of marginalization to account for modelling uncertainty when drawing inferences. More recently, large cohort studies such as the UK Biobank 184 have expanded the methodological requirements for identifying genetic associations with complex (sub)phenotypes by collecting genetic information alongside heterogeneous data sets including imaging, lifestyle and routinely collected health data. A Bayesian analysis framework known as TreeWAS 185 has extended genetic association methods to allow the incorporation of tree-structured disease diagnosis classifications by modelling the correlation structure of genetic effects across the observed clinical phenotypes. This approach incorporates prior knowledge of phenotype relationships that can be derived from diagnosis classification trees, such as information from the latest version of the International Classification of Diseases (ICD-10).

The availability of multiple molecular data types in multi-omics data sets has also attracted Bayesian solutions to the problem of multimodal data integration. Bayesian latent variable models can be used as an unsupervised learning approach to identify latent structures corresponding to known or previously uncharacterized biological processes at different molecular scales. Multi‐omics factor analysis 186 uses a Bayesian linear factor model to disentangle sources of heterogeneity that are common across multiple data modalities from those patterns that are specific to only a single data modality.

In recent years, high-throughput molecular profiling technologies have advanced to allow the routine multi-omics analysis of individual cells 187 . This has led to the development of many novel approaches for modelling single-cell measurement noise, cell-to-cell heterogeneity, high dimensionality, large sample sizes and interventional effects from, for example, genome editing 188 . Cellular heterogeneity lends itself naturally to Bayesian hierarchical modelling and formal uncertainty propagation and quantification owing to the layers of variability induced by tissue-specific activity, heterogeneous cellular phenotypes within a given tissue and stochastic molecular expression at the level of the single cell. In the integrated Bayesian hierarchical model BASiCS 189 , this approach is used to account for cell-specific normalization constants and technical variability to decompose total gene expression variability into technical and biological components.

Deep neural networks (DNNs) have also been utilized to specify flexible, non-linear conditional dependencies within hierarchical models for single-cell omics. SAVER-X 190 couples a Bayesian hierarchical model with a pretrainable deep autoencoder to extract transferable gene–gene relationships across data sets from different laboratories, variable experimental conditions and divergent species to de-noise novel target data sets. In scVI 191 , hierarchical modelling is used to pool information across similar cells and genes to learn models of the distribution of observed expression values. Both SAVER-X and scVI perform approximate Bayesian inference using mini-batch stochastic gradient descent, the latter within a variational setting — a standard technique in DNNs — that allow these models to be fitted to hundreds of thousands or even millions of cells.

Bayesian approaches have also been popular in large-scale cancer genomic data sets 192 and have enabled a data-driven approach to identifying novel molecular changes that drive cancer initiation and progression. Bayesian network models 193 have been developed to identify the interactions between mutated genes and capture mutational signatures that highlight key genetic interactions with the potential to allow for genomic-based patient stratification in both clinical trials and the personalized use of therapeutics. Bayesian methods have also been important in answering questions about evolutionary processes in cancer. Several Bayesian approaches for phylogenetic analysis of heterogeneous cancers enable the identification of the distinct subpopulations that can exist within tumours and their ancestral relationships through the analysis of single-cell and bulk tissue-sequencing data 194 . These models therefore consider the joint problem of learning a mixture model and graph inference by considering the number and identity of the subpopulations and deriving the phylogenetic tree.


1. An Introduction to Structural Equation Modeling

Broadly, structural equation modeling (SEM) unites a suite of variables in a single network. They are generally presented using box-and-arrow diagrams denoting directed (causal) relationships among variables:

Those variables that exist only as predictors in the network are referred to as exogenous, and those that are predicted (at any point) as endogenous. Exogenous variables therefore only ever have arrows coming out of them, while endogenous variables have arrows coming into them (which does not preclude them from having arrows come out of them as well). This vocabulary is important when considering some special cases later.

In traditional SEM, the relationships among variables (i.e., their linear coefficients) are estimated simultaneously in a single variance-covariance matrix. This approach is well developed but can be computationally intensive (depending on the size of the variance-covariance matrix) and additionally assumes independence and normality of errors, two assumptions that are generally violated in ecological research.

Piecewise structural equation modeling (SEM), also called confirmatory path analysis, was proposed in the early 2000s by Bill Shipley as an alternate approach to traditional variance-covariance based SEM. In piecewise SEM, each set of relationships is estimated independently (or locally). This process decomposes the network into the corresponding simple or multiple linear regressions for each response, each of which are evaluated separately, and then combined later to generate inferences about the entire SEM. This approach has two consequences: 1. Increasingly large networks can be estimated with ease compared to a single vcov matrix (because the approach is modularized), and 2. Specific assumptions about the distribution and covariance of the responses can be addressed using typical extensions of linear regression, such as fixed covariance structures, random effects, and other sophisticated modeling techniques.

Unlike traditional SEM, which uses a $\chi^2$ test to compare the observed and predicted covariance matrices, the goodness-of-fit of a piecewise structural equation model is obtained using ‘tests of directed separation.’ These tests evaluate the assumption that the specific causal structure reflects the data. This is accomplished by deriving the ‘basis set,’ which is the smallest set of independence claims obtained from the SEM. These claims are relationships that are unspecified in the model, in other words paths that could have been included but were omitted because they were deemed to be biologically or mechanistically insignificant. The tests ask whether these relationships can truly be considered independent (i.e., their association is not statistically significant within some threshold of acceptable error, typically $\alpha = 0.05$) or whether some causal relationship may exist as indicated by the data.

For instance, the preceding example SEM contains 4 specified paths (solid, black) and 2 unspecified paths (dashed, red), the latter of which constitute the basis set:

In this case, there are two relationships that need to be evaluated: y3 and x1, and y3 and y2. However, there are additional influences on y3, specifically the directed path from y2. Thus, the claims need to be evaluated for ‘conditional independence,’ i.e. that the two variables are independent conditional on the already specified influences on both of them. This also pertains to the predictors of y2, including the potential contributions of x1. So the full claim would be: y2 | y3 (y1, x1), with the claim of interest separated by the | bar and the conditioning variable(s) following in parentheses.

As the network grows more complex, however, the independence claims only consider variables that are immediately ancestral to the primary claim (i.e., the parent nodes). For example, if there was another variable predicting x1 , it would not be considered in the independence claim between y3 and y2 since it is >1 node away in the network.

The independence claims are evaluated by fitting a regression between the two variables of interest with any conditioning variables included as covariates. Thus, the claim above y2 | y3 (y1, x1) would be modeled as y3 ~ y2 + y1 + x1. These regressions are constructed using the same assumptions about y3 as specified in the actual structural equation model. So, for instance, if y3 is a hierarchically sampled variable predicted by y1, then the same hierarchical structure would carry over to the test of directed separation of y3 predicted by y2.

The P-values of the conditional independence tests are then combined in a single Fisher’s C statistic using the following equation:
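The combined statistic is commonly written as (with $p_i$ the P-value of the $i$-th of the $k$ independence claims in the basis set):

$$C = -2\sum_{i=1}^{k}\ln(p_i)$$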

This statistic is $\chi^2$-distributed with 2k degrees of freedom, with k being the number of independence claims in the basis set.

Shipley (2013) also showed that the C statistic can be used to compute an AIC score for the SEM, so that nested comparisons can be made in a model selection framework:
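A common statement of this score is:

$$AIC = C + 2K$$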

where K is the likelihood degrees of freedom. A further variant, $AIC_c$, can be obtained by adding an additional penalty based on sample size:
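With $n$ the sample size, this small-sample correction is usually given as:

$$AIC_c = C + 2K\frac{n}{n - K - 1}$$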

The piecewiseSEM package automates the derivation of the basis set and the tests of directed separation, as well as extraction of path coefficients based on the user-specified input.


Methods

Building the model

The list of reactions

As a starting point, the S. cerevisiae bibliome was searched for references related to a list of genes known to be involved in iron homeostasis. The bibliome consists of all articles concerning S. cerevisiae found on PubMed.

From these articles, a list of reactions was manually inferred and selected on the basis of our knowledge.

The new genes cited in these articles that we felt to be critical for model accuracy were then used to direct a new PubMed search. This process was repeated until we were unable to identify any new reactions strictly related to iron homeostasis.

The same process was used for inorganic phosphate homeostasis, for some oxidative stress reactions and for any other processes that we felt to be important for our model that were not described in databases.

We searched the Swissprot database for a list of proteins requiring iron-sulphur clusters or haem as a cofactor. We then selected metabolic pathways involving these proteins that could be directly or indirectly linked to iron homeostasis or oxidative stress. The reactions describing these pathways were expressed according to SGD pathways [68] on the Saccharomyces Genome Database website http://www.yeastgenome.org.

However, when including a given pathway, we did not systematically describe all its steps. If a pathway included several steps producing intermediate metabolites not required by any other pathway included in our model, we wrote the whole pathway as one reaction, unless we had reasons to include the intermediate reactions. For example, siroheme biosynthesis from uroporphyrinogen-III involves three reactions. The two intermediate metabolites produced, precorrin-2 and sirohydrochlorin, are not required by any other reaction already included in the model, so we could have expressed the whole process as one reaction. However, the first step involves S-adenosyl-methionine and S-adenosyl-homocysteine, which are already included in several reactions in our model, and the last step involves iron. Thus, if siroheme cannot be produced - in a mutant for instance - we want to be able to determine whether this deficiency is related to 2-S-adenosyl-methionine synthesis or iron availability. We therefore included siroheme biosynthesis as two reactions, the first producing precorrin-2 from uroporphyrinogen-III and 2-S-adenosyl-methionine and the second producing siroheme from precorrin-2 and NAD.

Finally, we searched the yeast metabolome (described in SGD pathways) for reactions that might link several metabolites already included in our model. This is the reason for which we included alanine degradation, for example, in the model.

The weights of the reactions

The default value for the weight of the reactions was one. However, some reactions were given a weight lower than one (for most degradation reactions, the weight is 0.01) or higher than one. For example, the reaction catalysed by the Sod1 superoxide dismutase was given a weight of 100, to take into account the extremely high abundance of this protein in yeast cells (519,000 molecules per cell [69]) and its very high catalytic efficiency and turnover number. A list of reactions given weights other than one is provided in Table 2. We simulated our model with all weights set to 1, and it was unable to produce realistic outputs (data not shown). For example, the hydroxyl radical elements have a PoP lower than 1% in the complete model, whereas the PoP is higher than 70% in the model without weights, which is not realistic (the PoP parameter is defined below, in the Simulations subsection). The WT model corresponds to a model whose outputs are biologically meaningful according to data from the literature and our own experience. We performed a sensitivity analysis of the outputs of the model (PoP at steady state) when the weights of the reactions are modified. Each weight was multiplied by a coefficient k. The differences between each PoP of the initial model and the PoP of the model with the modified weight were computed. Our results show that the model is robust to modifications of the weights when the coefficient k ranges from 0.1 to 10. See Additional file 2 for more details.

Simulations

Algorithm

We used a modified version of the Biocham asynchronous Boolean simulation algorithm, which can be summarised as follows:

1. Initial state: the list of elements that are "ON" at the beginning of the simulation.

2. Based on the list of elements that are "ON", the list of possible reactions is inferred: a reaction is possible if its reactants and modifiers are "ON".

3. A reaction is randomly selected from the list of possible reactions.

4. The products of the selected reaction are set to "ON".

5. The reactants are randomly either set to "OFF" or left "ON".

6. A new list of elements that are "ON" is computed.

Steps 2 to 6 are repeated for each simulation step.

In the Biocham algorithm, the reactants are not always set to "OFF", so it is possible to reselect the reaction in subsequent steps (see above step 5). This possibility reflects the presence of more than one molecule of each sort in biological systems. If the same reaction is selected over and over again, all the reactants will eventually be turned "OFF" and the reaction will cease to be possible unless other reactions set the reactants back to "ON". Indeed, in biological systems, all elements may be considered to exist in limited quantities.

We also had to overcome a limitation of Biocham to obtain simulations as close as possible to real biological systems: this algorithm does not mimic the large differences in reaction rates observed in real biological systems. Although we do not precisely know the rates of all the reactions in the model, we can reasonably state that the degradation of an enzyme is far less likely to occur than the reaction this enzyme catalyzes. Another example relates to the functioning of the TCA cycle: we know that some reactions of the cycle are technically reversible, but in practice the reaction always runs in one direction, because this direction is thermodynamically more favorable. We needed to mimic this situation to make our model as realistic as possible. We therefore decided to extend the algorithm by weighting the reactions. From the weights of the possible reactions identified at step 2 of the algorithm above, we calculated the probability of a reaction being selected as the weight of that reaction divided by the sum of the weights of all possible reactions. A reaction is therefore more likely to occur if it has a high weighting.
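A minimal sketch of one weighted asynchronous step, written in Python rather than the Biocham/C implementation actually used, may make the selection rule concrete. The data structures (dicts with 'reactants', 'modifiers', 'products' and 'weight') are illustrative assumptions, not the authors' code.

```python
import random

def simulation_step(state, reactions, rng=random):
    """One asynchronous Boolean step with weight-proportional reaction choice.

    state: dict mapping element name -> bool ("ON"/"OFF").
    reactions: list of dicts with keys 'reactants', 'modifiers',
               'products' (lists of element names) and 'weight' (float).
    """
    # Step 2: a reaction is possible if all its reactants and modifiers are ON.
    possible = [r for r in reactions
                if all(state[e] for e in r['reactants'] + r['modifiers'])]
    if not possible:
        return state

    # Step 3 (weighted): selection probability = weight / sum of all weights.
    chosen = rng.choices(possible, weights=[r['weight'] for r in possible], k=1)[0]

    # Step 4: products are switched ON.
    for e in chosen['products']:
        state[e] = True
    # Step 5: each reactant is randomly switched OFF or left ON.
    for e in chosen['reactants']:
        if rng.random() < 0.5:
            state[e] = False
    return state
```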

If the experimental rates of all reactions in the model were known, we could set their weights accordingly. As these rates were not all known, we defined relative weights, using a default rate of one. The weight of a given reaction was modified if we had quantitative or qualitative experimental knowledge relating to its reaction rate (see Table 2 for a list of reactions with weights other than one).

In this paper, all the elements were set to "ON" at the beginning of the simulations, thus defining the "initial conditions". We tested different initial conditions (for the non-constant elements only), which always resulted in the same steady state (data not shown).

Mean profiles

The simulation algorithm was developed in the C (for computation) and Python (for file/data management) languages on a Linux workstation. The simulations were performed on a cluster of 40 nodes (dual-core 64-bit AMD Opteron bi-processors, 2 GB RAM, PBS/Maui scheduler). Each type of model was simulated 100 times, until a steady state was reached (see below). From these 100 simulations and for each element, we defined the PoP as the percentage of simulations in which the element was "ON" (present), at each simulation step. This calculation is referred to as the profile of an element. Figure 4 shows examples of such profiles, in which the mean PoP was also calculated every 100,000 steps.
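For clarity, the PoP computation described here can be expressed in a few lines of NumPy; the array layout is an assumption made for illustration, not the original C/Python code.

```python
import numpy as np

def pop_profiles(trajectories):
    """PoP per step and element from repeated Boolean simulations.

    trajectories: boolean array of shape (n_simulations, n_steps, n_elements).
    Returns the percentage of simulations in which each element is ON,
    for every simulation step (shape: n_steps x n_elements).
    """
    return 100.0 * trajectories.mean(axis=0)

def block_means(pop, block=100_000):
    """Mean PoP over consecutive blocks of `block` steps (as in Figure 4)."""
    n_blocks = pop.shape[0] // block
    return pop[:n_blocks * block].reshape(n_blocks, block, -1).mean(axis=1)
```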

Steady states

We checked whether steady state had been reached by performing an ANOVA on the last 2 million steps, for every element (the null hypothesis being that the PoPs are independent of the iterations). If a significant result was obtained (p-value below 0.001), the simulation was rerun for a larger number of steps. As background noise in the simulation can generate false positives, if the means of the last million steps of the new simulation were, for each element, equal to the means of the previous simulations, then we considered steady state to have been reached.

For some mutants, some elements have PoPs that increase or decrease slowly. For example, in some mutants the reactive oxygen species have increasing PoPs. Most of them will have a PoP of 100 at steady state, but some will reach this maximum value faster than others. We argue that comparing the simulations of the mutants after the same number of steps may therefore reflect biological properties of the mutants. To take these differences into account in our systematic in silico mutant analysis, we used the PoP of the elements after 20 million steps even though the simulations had not yet reached steady state.

74% of the simulations reached steady state before 20 million steps, and among the remaining 26% only some very slowly increasing or decreasing elements had not yet reached steady state.

In silico mutations and mutants

Model simulations began with all elements "ON" and continued until steady state was reached: this situation corresponds to "wild-type" simulations. To simulate a mutant for a given gene, a wild-type simulation was carried out for one million steps. The gene in question was then turned "OFF" and the simulation was continued for 19 million more steps for the systematic mutant analysis, or until steady state was reached for the other analyses. This method for simulating mutants mimics experimental Tet-OFF mutants, in which the transcription of a given gene is controlled by a tetracycline-responsive promoter and can be turned off by adding tetracycline to the growth medium [70].

The model was simulated according to this procedure with, for each set of simulations, one of the unlimited elements deleted. Each type of model in which an unlimited element was turned "OFF" was referred to as a "mutant". Note that none of the reactions were modified.

Clustering

For each "mutant" model, the mean of the last million of steps of the simulation was calculated for each element. Then, for each element and for each "mutant", a distance to the "wild-type" was calculated as follows: (formula based on A.Ultsch RelDi [71]). In the resulting matrix, each column is an element of the model and each row a "mutant". This matrix was then clustered, using [email protected] (real location data type, relative error = 0.01), which identifies classes of "mutants" causing similar changes in the simulations. The matrix was then transposed and clustered again: this generated classes of elements changing in a similar fashion in the different "mutants". Figure 6 shows the matrix clustered for both elements and "mutants".

Figure 6 is annotated: the rows are annotated with the most significantly enriched gene ontology process (calculated with GoTermFinder [72]), and the columns were annotated manually (only the columns containing many elements whose PoP changes significantly were annotated - see Additional file 6 for the complete clustering).

Phenotype analysis

The file containing the manually curated phenotypes of the yeast mutants was retrieved from SGD (file edited 03/13/10). From this file, we extracted the phenotypes associated with mutants of genes present in the model. From this selection, we further extracted the phenotypes that could be compared with our simulation results (e.g. the auxotrophies related to molecules present in the model). Six types of phenotypes were extracted: "auxotrophies" (cysteine, methionine, heme, glutamate), "chemical compound accumulation" (of elements present in the model), "oxidative stress resistance", "respiratory growth" and "nitrogen source". We then compared these phenotypes with the PoP of the corresponding elements in the WT model and in the mutants. For the "oxidative stress resistance" phenotypes, we did not simulate the mutants with an additional source of stress, as was the case when the experimental phenotypes were observed. Instead, we looked for the production of ROS in the mutant as compared to the WT in standard simulations. Therefore, only the constitutively stressed mutants showed similar phenotypes in our simulations and in vivo. As for the "respiratory growth" phenotypes, we compared the PoP of the protons in the intermembrane space (noted "Hinter::mitochondria" in the model), because in the model an increase or a decrease in this element's PoP can be directly linked to a defect in respiratory metabolism. For the "nitrogen source utilization" phenotypes, we compared the PoP of the products of the utilization of the nitrogen source. See Additional file 3 for the full results.


Introduction

In economics, many processes depend on past events, so it is natural to use time-delay differential equations to model economic phenomena. Two main areas of applications are business cycle and economic growth theories. In recent decades, the analysis of the effect of investment delay has been the focus of extensive examination as a tool for endogenous cycles to explain business cycles and growth cycles. Differential equations with time delay (discrete or distributed) and their mathematical methods have been seen to be the most adequate tools to model the business cycle and growth in an economy where the investment delay plays a crucial role [1,2,3,4,5], as well as in physics, finance and biology [6,7,8,9].

The mechanism of the supercritical Hopf bifurcation leading to a stable limit cycle is a well-known route to the self-sustainable cyclic behavior. It can be employed for both ordinary differential equations and delay differential equations. Many examples of its use can be found in economics [10,11,12,13,14] and in other sciences [15,16,17].

One of the most influential models of the business cycle with time delay is the Kaldor–Kalecki model [18], which is based on the Kaldor model, one of the earliest endogenous business cycle models [19,20,21]. The Kaldor model is a prototype of a dynamical system with cyclic behavior in which nonlinearity plays a crucial role in generating endogenous cycles. Nonlinearities are a common feature used to model the complexity of economic systems [22]. In turn, the investment delay was assumed to be the average time of making investment, as proposed by Kalecki [23].

The investment decisions are taken given the current state of economy. These past investment decisions lead to the change of capital stock in a present economy and may cause fluctuations in economic variables. This kind of time delay, i.e., the time required for building new capital, is an intrinsic (response) type of time delay, which could be also found in neuron due to the autapse connection [24]. Time-delay systems with both response and propagation time delay were studied in many domains of science [25].

The Kaldor–Kalecki business cycle model has been the topic of several studies as well as augmentations. One of these was to incorporate an exponential trend to describe growth of an economy [26]. This new Kaldor–Kalecki growth model was formulated in a manner similar to that in which the Kaldor growth model was developed from the Kaldor business cycle model [27].

The Kaldor–Kalecki model has been extensively studied. While mostly the discrete delay was investigated, some Kaldor–Kalecki models with distributed delays were also proposed. The Kaldor–Kalecki models with fixed delay include both models with one delay and two delays [28,29,30,31,32].

In the existing literature, time delays can be modeled by assuming either fixed time lags or continuously distributed time lags (distributed delay henceforth). The former refers to economic circumstances where there is a set amount of time gap, institutionally or socially defined, for the agents concerned. The latter is suitable for economic situations where different lengths of delays are distributed across the various agents. A major difficulty is that time delays are not known exactly. On the other hand, distributed delays are based on the weighted average of all past data from time zero up to the current time period. Thus, distributed delays provide a more realistic description of economic systems with time delay. There is also some experimental evidence which indicates that they are more accurate than those with instantaneous time lags (see [33]). Cushing [34] introduced and used distributed delays in mathematical biology, while Invernizzi and Medio [35] presented distributed delays into mathematical economics. Some examples in context of economic growth are provided in [36] and [37].

In [38], we proposed an economic growth model in which the average time of investment completion is replaced by a distributed time length of investment. A gamma distribution for the investment delay is assumed, which makes it possible to consider different investment completion times. The resulting model is described by a dynamical system with a distributed time delay.
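For illustration, a distributed delay of this type enters a model through a convolution with a gamma kernel (a generic form consistent with the weak/strong kernel terminology used below, not the exact equations of [38]):

$\int_0^{\infty} x(t-s)\,\kappa_m(s)\,\mathrm{d}s, \qquad \kappa_m(s)=\frac{\alpha^{m} s^{m-1} e^{-\alpha s}}{(m-1)!}, \quad \alpha>0,\; m \in \{1,2,\dots\},$

where $m=1$ corresponds to the weak kernel, $m=2$ to the strong kernel, and the mean delay equals $m/\alpha$.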

While methods for delay differential equations are developing rapidly, the mathematical methods for ordinary differential equations remain far more mature, especially when distributed delays are considered. Therefore, it is convenient to approximate systems with distributed delays by systems of ordinary differential equations. One way to do this is provided by the so-called linear chain trick [39,40,41]. Consequently, an infinite-dimensional dynamical system is approximated by a finite-dimensional dynamical system, whose dimension can be chosen. For an example of this method applied to delayed chemical reaction networks, see [42]. We note that another way to approximate a delay differential equation system by an ordinary differential equation system is via the Padé approximation [43, 44].
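As a minimal sketch of the linear chain trick (in Python, using a generic delayed-logistic equation rather than the Kaldor–Kalecki system; parameter values are hypothetical), a weak-kernel distributed delay is replaced by one auxiliary variable, so the integro-differential equation becomes a two-dimensional ODE system; a strong kernel would add one further auxiliary variable per kernel stage.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Minimal sketch of the linear chain trick (generic delayed-logistic example,
# not the Kaldor-Kalecki system).  The distributed delay
#     z(t) = integral_0^inf a * exp(-a*s) * x(t - s) ds      (weak kernel)
# satisfies z'(t) = a * (x(t) - z(t)), so the scalar equation
#     x'(t) = r * x(t) * (1 - z(t)/C)
# becomes a two-dimensional ODE system.

r, C, a = 1.0, 1.0, 0.5          # hypothetical parameter values

def rhs(t, y):
    x, z = y
    dx = r * x * (1.0 - z / C)   # growth driven by the delayed average z
    dz = a * (x - z)             # auxiliary equation from the chain trick
    return [dx, dz]

sol = solve_ivp(rhs, (0.0, 200.0), [0.1, 0.1], max_step=0.1)
print(sol.y[:, -1])              # long-run state of (x, z)
```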

The main aim of this paper is to study the emergence of a bifurcation due to the change of parameter values in the approximated Kaldor–Kalecki growth model. We consider the two simplest cases of three- and four-dimensional dynamical systems obtained through the linear chain trick from the Kaldor–Kalecki growth model with distributed delay, corresponding to the weak and strong kernels, respectively. For both models, we establish conditions for the existence of a Hopf bifurcation with respect to the time-delay parameter and the rate-of-growth parameter. It is found that both parameters play a role in the scenario leading to the Hopf bifurcation and the resulting cyclic behavior.

In the numerical part of this paper, we determine in detail the ranges of parameter values for which cyclical behavior is possible. In this analysis, we use the investment function obtained by Dana and Malgrange for the French economy [27]. As in the theoretical part of the paper, we choose the time-delay parameter and the rate-of-growth parameter, as well as the adjustment parameter, for the bifurcation investigations. It is shown how some combinations of these three parameters can trigger cycles through the Hopf bifurcation mechanism. In the three-parameter space of the model, we obtain a surface (a section of a paraboloid) separating the regions with stable and cyclic solutions.
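A rough numerical recipe for locating such Hopf points (a hedged sketch in Python; the placeholder system below is the Hopf normal form, not the calibrated Kaldor–Kalecki model of Dana and Malgrange) is to track the eigenvalues of the Jacobian at the equilibrium while sweeping a parameter and to flag where a complex-conjugate pair crosses the imaginary axis:

```python
import numpy as np

def jacobian(f, y, mu, eps=1e-6):
    """Finite-difference Jacobian of f(y, mu) with respect to y."""
    y = np.asarray(y, dtype=float)
    f0 = np.asarray(f(y, mu))
    J = np.zeros((len(f0), len(y)))
    for j in range(len(y)):
        yp = y.copy()
        yp[j] += eps
        J[:, j] = (np.asarray(f(yp, mu)) - f0) / eps
    return J

def f(y, mu):
    # Placeholder 2D system with a supercritical Hopf bifurcation at mu = 0
    # (the normal form), standing in for the economic model.
    x1, x2 = y
    r2 = x1**2 + x2**2
    return [mu * x1 - x2 - x1 * r2, x1 + mu * x2 - x2 * r2]

for mu in np.linspace(-0.2, 0.2, 5):
    eig = np.linalg.eigvals(jacobian(f, [0.0, 0.0], mu))
    # A Hopf point is flagged where the leading real part changes sign
    # while the imaginary parts stay away from zero.
    print(f"mu = {mu:+.2f}  max Re = {eig.real.max():+.3f}  Im = {eig.imag}")
```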


Selective Phenome Growth Adapted

Aptamers are single-stranded oligonucleotides selected by evolutionary approaches from massive libraries, with significant potential for specific molecular recognition in diagnostics and therapeutics. A complete empirical characterisation of an aptamer selection experiment is not feasible due to the vast complexity of aptamer selection. Simulation of aptamer selection has been used to characterise and optimise the selection process; however, the absence of a good model for aptamer-target binding limits this field of study. Here, we generate theoretical fitness landscapes which appear to more accurately represent aptamer-target binding. The method used to generate these landscapes, selective phenome growth, is a new approach in which phenotypic contributors are added to a genotype/phenotype interaction map sequentially in such a way as to increase the fitness of a selected fit sequence. In this way, a landscape is built around the selected fittest sequences. Comparison with empirical aptamer microarray data shows that our theoretical fitness landscapes represent aptamer-ligand binding more accurately than other theoretical models. These improved fitness landscapes have potential for the computational analysis and optimisation of other complex systems.

1. Introduction

1.1. Background

Aptamers are single-stranded nucleic acid sequences capable of specific high-affinity binding [1–4]. This makes them attractive candidates as recognition molecules in diagnostics and therapeutics. Aptamers are isolated by the systematic evolution of ligands by exponential enrichment (SELEX), which involves several iterative steps of incubation with the target, washing away of weak binders, and PCR amplification of strong binders.

Aptamer selection is complex. Many variables such as library size, quantity of target, temperature, selection buffer, pH, degree of PCR amplification, and use of mutation or recombination diversification need to be considered. Due in part to these factors, less than 30% of classical SELEX experiments succeed in isolating aptamers with dissociation constants below 30 nM [5]. Understanding the dynamics of the selection process is extremely important, but what does this entail? The DNA required to fully represent all permutations of a 75-base aptamer library would have a mass roughly equal to that of the moon [6]. In order to represent this immense sequence space, an initial SELEX library may contain up to 10^15 molecules. A rigorous empirical analysis of anything close to this number of library members is simply not feasible.
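A rough back-of-the-envelope check of the "mass of the moon" comparison (our own arithmetic, assuming about 330 g/mol per nucleotide and a lunar mass of roughly 7.3 × 10^22 kg):

```python
# Rough check of the claim, under the assumptions stated above.
AVOGADRO = 6.022e23              # molecules per mole
permutations = 4 ** 75           # every possible 75-base sequence (~1.4e45)
molar_mass = 75 * 330.0          # approximate g/mol of a 75-nt ssDNA strand
mass_kg = permutations / AVOGADRO * molar_mass / 1000.0
print(f"{mass_kg:.1e} kg")       # ~5.9e22 kg, versus ~7.3e22 kg for the moon
```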

Nevertheless, empirical analyses of smaller fractions of DNA aptamer libraries have been undertaken. The two main empirical selection analysis techniques are high-density DNA microarrays and high-throughput sequencing (HTS). Briefly, high-density microarrays can contain up to approximately 1 million features, each representing an aptamer in a library. After array incubation with fluorescent target and a washing step, the binding affinity scores of all aptamers on the array can be measured by fluorescence scanning. Platt et al. [7], Knight et al. [8], and Rowe et al. [9] used microarrays both to evolve aptamers and to gain insight into an aptamer binding landscape. Additionally, DNA microarray data has been applied to aptamer specificity landscapes [10], fitness landscape morphology [11], and aptamer affinity maturation [12]. In comparison, the possible sequence space coverage using HTS is much greater, with Illumina’s HiSeq HTS capable of yielding sequence data for more than 70 million sequences from a single lane [13]. With this approach, the copy number of a sequence is used as a proxy for its target binding strength, so that the fitness of individuals in the library pools can be estimated. Cho et al. used this HTS approach to monitor microfluidic aptamer selection rounds and gauge enrichment [14]. PCR bias may distort this copy-number/binding correlation, but by using a motif-based statistical framework such as MPBind [15], the binding potential of aptamers can be predicted, reducing this source of error. Although both DNA microarrays and HTS led to major breakthroughs in understanding library sequence space fitness and selection, these techniques can analyse only a small fraction of a given library’s sequence space. Another approach to analysing aptamer selection is computational simulation. The challenge in simulating aptamer selection is the design of a suitable model for aptamer binding fitness.

Computational approaches to model aptamer fitness by virtue of folding include secondary structure prediction by minimum free energy [16] and inverse folding [17]. These approaches can be computationally expensive and, while being excellent models for folding, they may not capture the higher complexity of molecular binding. Hoinka et al. coded a program to simulate the aptamer selection process called “AptaSim” [18]. The binding model used assigned aptamer affinities at random, without reference to sequence. While AptaSim was an important step forward in simulating selection enrichment and mutation copy number, it cannot appropriately represent heritability or the binding correlation between related sequences required for the study of genetic systems. Oh et al. used a string-matching function as a binding fitness model to simulate aptamer selection [19]. This model does include heritability and binding correlation between related sequences, but because only close-range epistasis is possible with a single “optimal solution” aptamer, the landscape is cone shaped and would often be unrepresentative of a true aptamer binding landscape.

Kauffman’s NK model is a robust mathematical model, related to the study of autocatalytic sets, which serves as an objective function relating genotypic sequence to phenotypic fitness [20]. Derivations of the original model have been used to describe complex interacting systems in areas as diverse as immunology [21], evolutionary biology [22], and economics [23]. The model describes a fitness landscape whose size is determined by the number of components in the system (N), and the ruggedness of the landscape can be tuned using the degree of interaction of these components (K). This system is perhaps best described when used to represent problems in evolutionary biology, as originally intended by Kauffman. A population of genomes, each containing N genes, is given fitness values based on the sum of fitness contributions from each of their genes. The fitness contribution of each gene is determined by its interaction with other genes within its own genome. The interacting genes can be positioned sequentially, randomly, or by some other gene interaction pattern predetermined by an interaction map. The allelic sequence of these interacting genes is assigned a fitness contribution, usually drawn from a generated random distribution. In this way, the allelic substitution of one interacting gene produces a completely different fitness contribution score for the entire collection of interacting genes. In the NK model, increasing K (the number of interactions between genes) increases the complexity of the system and the ruggedness of the landscape. In addition to the value of K, the position of these interactions on the interaction map is of great importance to the fitness landscape.
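To make the scoring concrete, the following minimal sketch (our own illustrative Python implementation, not code from the works cited) evaluates an NK fitness function in which each position’s contribution depends on its own state and on the states of K randomly chosen interacting positions:

```python
import random

def nk_fitness(genome, neighbours, tables):
    """Average of per-position fitness contributions on an NK landscape."""
    total = 0.0
    for i, nbrs in enumerate(neighbours):
        # The contribution of position i depends on its own allele and on
        # the alleles of its K interacting positions.
        key = (genome[i],) + tuple(genome[j] for j in nbrs)
        total += tables[i][key]
    return total / len(genome)

def random_nk_landscape(N, K, seed=0):
    """Random interaction map plus random contribution tables."""
    rng = random.Random(seed)
    neighbours = [rng.sample([j for j in range(N) if j != i], K)
                  for i in range(N)]
    tables = []
    for i in range(N):
        table = {}
        # One random value for every combination of the K+1 relevant alleles.
        for state in range(2 ** (K + 1)):
            key = tuple((state >> b) & 1 for b in range(K + 1))
            table[key] = rng.random()
        tables.append(table)
    return neighbours, tables

N, K = 20, 3                                   # hypothetical landscape size
neighbours, tables = random_nk_landscape(N, K)
rng = random.Random(42)
genome = [rng.randint(0, 1) for _ in range(N)]
print(nk_fitness(genome, neighbours, tables))
```

A mutation-based search over such a landscape flips individual bits and keeps changes that raise this score, which is how the model serves as an objective function for simulated evolution.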

Typically the NK model is used as a scoring system for a population of genomes which can evolve via diversifications such as mutation or recombination. In this way the NK model is an objective function for a complex system. As mentioned earlier, the model can be adapted to many other areas of study. In this paper we use the NK model to represent binding of an aptamer to a target analyte. In this representation, N equals the length of the aptamers in the library and K equals the number of interactions between bases within each aptamer. Many modifications to the original model have been made, some to optimise the model for a given research area. Herein we describe some modifications to the NK model which are aimed at optimising it to represent binding of an aptamer to a target analyte. The NK model was believed to resemble molecular fitness landscapes similar to the binding of an aptamer to an analyte [24]. In the NK model, mutational additivity usually holds for noninteracting positions in sequences. This mutational additivity is biologically accurate, as has been demonstrated for several proteins [25–33].

Wedge et al. used an NK model for the simulation of protein directed evolution (DE) [36], a field similar to aptamer selection. Binary strings of length 40 and 100 were used, with random epistatic interactions varying from K = 0 to K = 10. Genetic algorithms utilising mutation, crossover, different library sizes, and selection pressures were simulated and compared to deduce general rules for protein directed evolution, which are of great use to DE experiments. As noted in that study, the “No Free Lunch Theorem” (NFL) [37] establishes that all search algorithms perform exactly the same when averaged over all possible problems. This implies that, for an optimisation algorithm, any elevated performance on one class of problems is exactly paid for in performance on another class. If there is a discrepancy between a real-life system and a model used to describe it, any elevated performance in optimisation using simulation of the model is exactly paid for in performance on the real-life system. This illustrates the need for an accurate model when using simulation results to improve empirical ligand selection experiments.

Despite this biological accuracy with regard to mutational additivity, the classical NK model may have limitations in representing some biological systems. The model’s greatest utility is that ruggedness can be tuned using the epistasis variable K. However, this epistasis is quite uniform throughout the sequence. For some biological applications, a higher amount of epistasis is desirable. As K increases, the landscape tends to become more multipeaked and rugged, to the point where it is too chaotic to allow adaptation. Kauffman refers to this phenomenon as the “complexity catastrophe” [38]. Kauffman goes further to say “the complexity catastrophe is averted in the NK model for those landscapes which are sufficiently smooth to retain high optima as N increases” [38]. Thinking along these lines provided a solution to the complexity catastrophe: creating complex landscapes which retain a smoother surface.

1.2. Constructional Selection of Landscapes

Altenberg developed an evolutionary approach to selecting epistatic interactions, thereby creating landscapes which are smoother than classic landscapes with the same degree of epistatic interaction [34]. Altenberg achieved this using selective genome growth, a type of constructional selection, to create modular interaction matrices. These selected matrices have reduced epistasis, which gives rise to smoother fitness landscapes [34, 39]. Selective genome growth is a process by which the genome of the fittest individual is expanded one gene at a time (Figure S1a in Supplementary Material available online at https://doi.org/10.1155/2017/6760852). The new gene is only kept if the fitness of a selected optimum genome is increased. In this way the probable global optimum of the landscape is constructed, and all other points on the landscape are relative to this optimisation. A similar method for creating landscapes was devised by Hebbron et al., which uses a preferential attachment growth process to add genes to a genome [40]. A problem with these two approaches, when applied to specific systems, is that due to the increasing returns of the selection process these methods attribute extremely high pleiotropy to a handful of genes (vertical lines in Figure 2(c)). This phenomenon of increasing returns of gene control is biologically appropriate and accurate for a system describing a group of genomes, but when describing the binding of an aptamer to an analyte this high aggregated pleiotropy is not biologically appropriate. Each base in an aptamer has a relatively low number of interactions due to its spatial capacity, meaning that high aggregated pleiotropy is not biologically representative for an aptamer.
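As an illustration of this constructional-selection step, the heavily simplified sketch below (our own, assuming a binary genome and hash-seeded random fitness contributions; it does not reproduce Altenberg’s exact procedure) tries to add one gene at a time and keeps it only if the fittest extended genome beats the current fittest one:

```python
import random

rng = random.Random(1)

def contribution(func_id, alleles):
    # Reproducible pseudo-random value in [0, 1) keyed by this fitness
    # component and the alleles of the genes that control it.
    return random.Random(hash((func_id,) + tuple(alleles))).random()

def fitness(genome, functions):
    if not functions:
        return 0.0
    return sum(contribution(f_id, [genome[g] for g in genes])
               for f_id, genes in functions) / len(functions)

genome, functions = [], []            # start from an empty genome
for attempt in range(40):             # forty attempts to add a gene
    gene_index = len(genome)          # index the new gene would occupy
    # The candidate gene controls one new fitness component together with
    # up to two randomly chosen existing genes (its epistatic partners).
    partners = rng.sample(range(gene_index), min(2, gene_index))
    candidate = (len(functions), partners + [gene_index])
    best_old = fitness(genome, functions)
    # Try both alleles of the new gene; keep the addition only if the best
    # extended genome is fitter than the current fittest genome.
    trials = [(fitness(genome + [a], functions + [candidate]), a)
              for a in (0, 1)]
    best_new, allele = max(trials)
    if best_new > best_old:
        genome.append(allele)
        functions.append(candidate)

print(len(genome), "genes kept; fitness =", fitness(genome, functions))
```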

Herein we have created a new model that we have termed “selective phenome growth.” Selective phenome growth is a constructional selection technique in which phenotypic contributing factors are added to a genotype-phenotype interaction map incrementally (Figure S1b) in such a way that each new phenotypic contributing factor increases the fitness of global or local optima. Additionally, comparison is made between selective phenome growth landscapes and aptamer binding landscapes.

2. Model and Methods

2.1. Selective Phenome Growth to Create a Genotype/Phenome Interaction Map

Selective phenome growth is a new method of constructing an interaction matrix one phenotypic contributor at a time. The method of representing the interaction map is the same as Altenberg’s [34], with slight modification to represent aptamers, and is as follows: (1) The aptamer consists of binary-valued bases that have influence over a set of phenotypic functions, each of which contributes a component to the total fitness. (2) Each base controls a subset of the fitness components, and, in turn, each fitness component is controlled by a subset of the bases. This genotype-phenotype map can be represented by a matrix,