GFOLD analysis


2023-08-11 11:08 | Source: compiled from the web

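Two questions came up in an earlier analysis: 1. With only one sample and no replicates, is there no way to run a differential expression analysis? 2. With three replicates for one sample and only one for the other, how should the analysis be done?

Jimmy suggested that GFOLD can handle both cases and pointed to an installation guide:

http://www.biotrainee.com/thread-752-1-1.html
The reference paper for GFOLD is https://pubmed.ncbi.nlm.nih.gov/22923299/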

Installing and using GFOLD

Installing GFOLD

First build GSL, which GFOLD links against:

cd /public/vip/biosoft/
wget http://mirrors.ocf.berkeley.edu/gnu/gsl/gsl-latest.tar.gz
tar -zxvf gsl-latest.tar.gz
cd gsl-2.7/
./configure --prefix=/public/vip/biosoft/gsl-2.7
make
make install

Then download and build GFOLD:

cd ../
wget https://zhanglab.tongji.edu.cn/softwares/GFOLD/gfold.V1.1.4.tar.gz
tar -zxf gfold.V1.1.4.tar.gz
cd gfold.V1.1.4/
make

If make cannot find GSL, point the compiler and the loader at the GSL install and compile directly:

export CXXFLAGS="-g -O3 -I/public/vip/biosoft/gsl-2.7/include -L/public/vip/biosoft/gsl-2.7/lib"
export LD_LIBRARY_PATH="/public/vip/biosoft/gsl-2.7/lib:$LD_LIBRARY_PATH"
g++ -O3 -Wall -g main.cc -o gfold -lgsl -lgslcblas -I/public/vip/biosoft/gsl-2.7/include -L/public/vip/biosoft/gsl-2.7/lib

To make the environment variables permanent, add the same two export lines to ~/.bashrc and reload it:

vi ~/.bashrc
source ~/.bashrc

GFOLD can now be run directly:

/public/vip/biosoft/gfold.V1.1.4/gfold -h

Usage

Example 3: Identify differentially expressed genes without replicates

gfold diff -s1 sample1 -s2 sample2 -suf .read_cnt -o sample1VSsample2.diff

Example 4: Identify differentially expressed genes with replicates

gfold diff -s1 sample1,sample2,sample3 -s2 sample4,sample5,sample6 -suf .read_cnt -o 123VS456.diff
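The two commands above differ only in the sample lists given to -s1 and -s2. As a minimal sketch, a small wrapper can assemble the call for any pair of groups. The function name gfold_diff_cmd is hypothetical, and the wrapper only prints the command (a dry run) so it can be inspected before use.

```shell
# Hypothetical wrapper: print (not run) a gfold diff command for two sample groups.
# Arguments: comma-separated group 1, comma-separated group 2, output file name.
gfold_diff_cmd() {
    local group1="$1" group2="$2" out="$3"
    # -suf .read_cnt is the suffix appended to every sample name,
    # exactly as in the examples above.
    printf 'gfold diff -s1 %s -s2 %s -suf .read_cnt -o %s\n' "$group1" "$group2" "$out"
}

# Reproduces the commands of Examples 3 and 4.
gfold_diff_cmd sample1 sample2 sample1VSsample2.diff
gfold_diff_cmd sample1,sample2,sample3 sample4,sample5,sample6 123VS456.diff
```

Once the printed command looks right, it can be piped to sh to run it.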

Example 5: Identify differentially expressed genes with replicates only in one condition

Only the first group contains replicates. In this case, the variance estimated based on the first group will be used as the variance of the second group

gfold diff -s1 sample1,sample2 -s2 sample3 -suf .read_cnt -o 12VS3.diff

Input file requirements:

The input files (.read_cnt) are the output of gfold count. All fields in such a file are separated by TABs.

The file contains 5 columns, in this order:

1.GeneSymbol

2.GeneName

3.Read Count

4.Gene exon length

5.RPKM
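Before running gfold diff it can be worth confirming that each input file really has this layout. The following is a minimal sketch, assuming a plain TAB-separated file with no header line; the function name check_read_cnt and the demo values are made up for illustration.

```shell
# Hypothetical check: report lines of a .read_cnt file that do not have
# exactly 5 TAB-separated fields. Prints nothing and returns 0 if all lines pass.
check_read_cnt() {
    awk -F'\t' 'NF != 5 { printf "line %d has %d fields\n", NR, NF; bad = 1 }
                END { exit bad }' "$1"
}

# Demo on a made-up two-gene file (values are illustrative only).
demo=$(mktemp)
printf 'GeneA\tNM_0001\t120\t1500\t8.2\n' >  "$demo"
printf 'GeneB\tNM_0002\t0\t900\t0\n'      >> "$demo"
check_read_cnt "$demo" && echo "format OK"
rm -f "$demo"
```

A nonzero exit status means at least one line is malformed, so the check can also guard a pipeline step.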

The output file generated by gfold diff contains the following columns:

1 GeneSymbol

2 GeneName

3 GFOLD: GFOLD value for every gene. The GFOLD value can be considered a reliable log2 fold change. It is positive/negative if the gene is up/down regulated. The main usefulness of GFOLD is to provide a biologically meaningful ranking of the genes. The GFOLD value is zero if the gene doesn't show differential expression. If the log2 fold change is treated as a random variable, a positive GFOLD value x means that the probability of the log2 fold change (2nd/1st) being larger than x is (1 - the parameter specified by -sc); a negative GFOLD value x means that the probability of the log2 fold change (2nd/1st) being smaller than x is (1 - the parameter specified by -sc). If this file is sorted by this column in descending order, genes ranked at the top are differentially up-regulated and genes ranked at the bottom are differentially down-regulated. Note that a gene with GFOLD value 0 should never be considered differentially expressed. However, this doesn't mean that all genes with a nonzero GFOLD value are differentially expressed. For taking the top differentially expressed genes, the user is responsible for selecting the cutoff.

4 E-FDR: Empirical FDR based on replicates. It is always 1 when no replicates are available.

5 log2fdc: log2 fold change. If no replicate is available, and -acc is T, log2 fold change is based on read counts and normalization constants. Otherwise, log2 fold change is based on the sampled expression level from the posterior distribution.

6 1stRPKM: The RPKM for the first condition. It is available only if gene length is available. If multiple replicates are available, the RPKM is calculated simply by summing over replicates. Because RPKM actually uses sequencing depth as the normalization constant, the log2 fold change based on RPKM could be different from the log2fdc field.

7 2ndRPKM: The RPKM for the second condition. It is available only if gene length is available. Please refer to 1stRPKM for more information.
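The ranking described for the GFOLD column can be sketched as a short filter and sort. This assumes a TAB-separated .diff file whose header and comment lines start with '#'. The function name rank_by_gfold and the demo rows are made up for illustration, and choosing a cutoff is still left to the user.

```shell
# Hypothetical helper: list genes with nonzero GFOLD (column 3),
# strongest up-regulation first and strongest down-regulation last.
rank_by_gfold() {
    awk -F'\t' '!/^#/ && $3 != 0' "$1" | sort -t "$(printf '\t')" -k3,3nr
}

# Demo on made-up rows: GeneSymbol, GeneName, GFOLD, E-FDR, log2fdc, 1stRPKM, 2ndRPKM.
demo=$(mktemp)
printf '#GeneSymbol\tGeneName\tGFOLD\tE-FDR\tlog2fdc\t1stRPKM\t2ndRPKM\n' >  "$demo"
printf 'GeneA\tNM_1\t1.5\t1\t2.0\t3.0\t12.0\n'  >> "$demo"
printf 'GeneB\tNM_2\t0\t1\t0.1\t5.0\t5.4\n'     >> "$demo"
printf 'GeneC\tNM_3\t-0.8\t1\t-1.2\t9.0\t3.9\n' >> "$demo"
printf 'GeneD\tNM_4\t2.3\t1\t3.1\t1.0\t8.6\n'   >> "$demo"
rank_by_gfold "$demo" | cut -f1   # prints GeneD, GeneA, GeneC (GeneB has GFOLD 0)
rm -f "$demo"
```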

Step 1: Preprocess the data in R and read it in

rm(list = ls())
chenshu R34_R2.diff.up.txt

The awk commands below keep rows by GFOLD (column 3) and E-FDR (column 4):

awk '{if ($3>1 && $4==1) print $0}' R34_R2.diff OFS='\t' >R34_R2.diff_2_up.txt
awk '{if ($3>0 && $4==1) print $0}' C123_D2.diff OFS='\t' >C123_D2.diff.up.txt
awk '{if ($3>1 && $4==1) print $0}' C123_D2.diff OFS='\t' >C123_D2.diff_2_up.txt
awk '{if ($3>0 && $4==1) print $0}' C123_R34.diff OFS='\t' >C123_R34.diff.up.txt
awk '{if ($3>1 && $4==1) print $0}' C123_R34.diff OFS='\t' >C123_R34.diff_2_up.txt
awk '{if ($3
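The per-file awk filters in this step repeat the same pattern, so they can be folded into one function and a loop. This is a sketch under assumptions: the function name filter_up is hypothetical, comment lines are assumed to start with '#', and the E-FDR test is written as a comparison (awk needs == to compare, since a single = would assign to the field).

```shell
# Hypothetical helper: keep rows of a gfold .diff file whose GFOLD (column 3)
# exceeds a threshold and whose E-FDR (column 4) equals 1.
filter_up() {
    local diff_file="$1" min_gfold="$2"
    awk -F'\t' -v min="$min_gfold" '!/^#/ && $3 > min && $4 == 1' "$diff_file"
}

# Apply both thresholds to every comparison found in the current directory.
for name in R34_R2 C123_D2 C123_R34; do
    [ -f "${name}.diff" ] || continue
    filter_up "${name}.diff" 0 > "${name}.diff.up.txt"
    filter_up "${name}.diff" 1 > "${name}.diff_2_up.txt"
done
```

Passing the threshold as an argument keeps the function reusable if a different cutoff is chosen later.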

