I have a data frame that looks like this:
my_df[1,]
gene_id ensg00000171680.16; transcript_id enst00000400915.3; gene_type protein_coding; gene_status known; gene_name plekhg5; transcript_type protein_coding; transcript_status known; transcript_name plekhg5-002; exon_number 4; exon_id ense00003634700.1; level 2; protein_id ensp00000383706.3; tag basic; tag appris_candidate; tag ccds; ccdsid ccds41241.1; havana_gene otthumg00000000905.3; havana_transcript otthumt00000002631.1;
my_df[2,]
gene_id ensg00000173662.15; transcript_id ensg00000173662.15; gene_type protein_coding; gene_status known; gene_name tas1r1; transcript_type protein_coding; transcript_status known; transcript_name tas1r1; level 1; havana_gene otthumg00000001441.2;
7734 levels: gene_id ensg00000007923.11; transcript_id ensg00000007923.11; gene_type protein_coding; gene_status known; gene_name dnajc11; transcript_type protein_coding; transcript_status known; transcript_name dnajc11; level 2; havana_gene otthumg00000001443.3; ...
my_df[n,]
................
I want to extract the ensg* ID contained in each row (14,000 rows, 1 column) of the full data frame. I tried to use the grep function, but unfortunately it gives integer(0).
Expected output:
gene_id ensg00000007923.11
gene_id ensg00000173662.15
.............
Can anyone please help me solve this issue?
Kind regards
I think you must use a pattern such as:
pattern <- '[e][n][s][g]'
Use this pattern in the grep() function to return the matching row numbers.
You can then subset the data frame accordingly.
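As a minimal sketch of that approach, the snippet below builds a small toy stand-in for my_df (shortened rows, and a column name V1 that is only an assumption), finds the matching row numbers with grep(), subsets the data frame, and then pulls out just the "gene_id ensg..." part with regexpr() and regmatches(). Since the original grep() call is not shown, the cause of integer(0) is a guess: one possibility is that the whole data frame was passed instead of its single column, so the sketch searches the column converted to character. ignore.case = TRUE is used in case the real IDs are upper case.

```r
# Toy stand-in for my_df: one factor column (the name "V1" is an assumption,
# and the rows are shortened versions of the ones shown in the question).
my_df <- data.frame(
  V1 = c(
    "gene_id ensg00000171680.16; transcript_id enst00000400915.3; gene_name plekhg5;",
    "gene_id ensg00000173662.15; transcript_id ensg00000173662.15; gene_name tas1r1;",
    "no gene annotation here"
  ),
  stringsAsFactors = TRUE
)

# Search the column as character text, not the data frame itself.
x <- as.character(my_df[[1]])

# Pattern for "gene_id" followed by an ensg ID with a version suffix.
pattern <- "gene_id ensg[0-9]+\\.[0-9]+"

# Row numbers whose text contains the pattern.
hits <- grep(pattern, x, ignore.case = TRUE)

# Subset the data frame to the matching rows (drop = FALSE keeps it a data frame).
matched_df <- my_df[hits, , drop = FALSE]

# Extract only the matched "gene_id ensg..." text from each matching row.
gene_ids <- regmatches(x, regexpr(pattern, x, ignore.case = TRUE))
print(gene_ids)
```

Here hits holds the row numbers to subset with, while gene_ids holds one extracted string per matching row, which is closer to the expected output in the question.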