python - Merge fields in a file


I have a file with 7 columns, a GFF file containing chromosomal regions. I want to collapse rows where region = "exon" into 1 row in the file. Rows have to be collapsed on the basis of their regions overlapping each other.

region   start   end     score  strand  frame  attribute
exon     26453   26644   .      +       .      transcript "xm_092971"; name "xm_092971"
exon     26842   27020   .      +       .      transcript "xm_092971"; name "xm_092971"
exon     30355   30899   .      -       .      transcript "xm_104663"; name "xm_104663"
gs_tran  30355   34083   .      -       .      gs_tran "hs22_30444_28_1_1"; name "hs22_30444_28_1_1"
snp      30847   30847   .      +       .      snp "rs2971719"; name "rs2971719"
exon     31012   31409   .      -       .      transcript "xm_104663"; name "xm_104663"
exon     34013   34083   .      -       .      transcript "xm_104663"; name "xm_104663"
exon     40932   41071   .      +       .      transcript "xm_092971"; name "xm_092971"
snp      44269   44269   .      +       .      snp "rs2873227"; name "rs2873227"
snp      45723   45723   .      +       .      snp "rs2227095"; name "rs2227095"
exon     134031  134495  .      -       .      transcript "xm_086913"; name "xm_086913"
exon     134034  134457  .      -       .      transcript "xm_086914"; name "xm_086914"

Looking at the sample data above, only the last 2 rows can be merged into 1 row. So the new row becomes:

exon    134031  134495  .   -   .   transcript "xm_086913"; name "xm_086913"             

If the end of the second row had been greater than that of the first, that would be the end of the merged region. Basically, if there is an overlap, take the start of the region that starts earlier and the end of the one that ends later.
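The overlap rule described above can be sketched on plain numbers. This is only an illustration: the function names `overlaps` and `merged_span` are hypothetical, and the coordinates are assumed to be inclusive start and end positions.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """Return True if the inclusive ranges [start_a, end_a] and [start_b, end_b] share a position."""
    return start_a <= end_b and start_b <= end_a


def merged_span(start_a, end_a, start_b, end_b):
    """Return the (start, end) pair that covers both ranges."""
    return min(start_a, start_b), max(end_a, end_b)


# The last two exon rows from the sample data overlap.
print(overlaps(134031, 134495, 134034, 134457))     # True
print(merged_span(134031, 134495, 134034, 134457))  # (134031, 134495)

# The first two exon rows do not overlap, so they stay separate.
print(overlaps(26453, 26644, 26842, 27020))         # False
```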

There can be multiple instances of such rows; here it is only the last 2 rows. One thing to note: the attribute column shows different transcript names for such rows, whereas it is the same in other cases.

I have to do this in Python, and I am a beginner in Python.

Break it down into simpler steps:

  • read the file and parse it into a list of data
  • loop over the list and check each row against the next
  • append the ones that fulfill the requirements to a new list
  • save the new list to a new file or print it to the console
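The reading and saving steps might look roughly like this. It is a sketch under assumptions: the columns are separated by whitespace, the header line starts with "region", and `parse_rows` and `format_row` are hypothetical helper names. The sample lines stand in for a real file.

```python
def parse_rows(lines):
    """Turn text lines into a list of 7-field rows, skipping the header and blank lines."""
    rows = []
    for line in lines:
        line = line.strip()
        # Assumption: the header line is the one starting with "region".
        if not line or line.startswith("region"):
            continue
        # Split on whitespace at most 6 times so the attribute column,
        # which itself contains spaces, stays in one piece.
        fields = line.split(None, 6)
        fields[1] = int(fields[1])  # start
        fields[2] = int(fields[2])  # end
        rows.append(fields)
    return rows


def format_row(row):
    """Join a row back into one tab-separated line."""
    return "\t".join(str(field) for field in row)


# Stand-in for the lines of a real file.
sample = [
    'region\tstart\tend\tscore\tstrand\tframe\tattribute',
    'exon\t134031\t134495\t.\t-\t.\ttranscript "xm_086913"; name "xm_086913"',
    'exon\t134034\t134457\t.\t-\t.\ttranscript "xm_086914"; name "xm_086914"',
]
rows = parse_rows(sample)
for row in rows:
    print(format_row(row))
```

With a real file, an open file object can be passed in place of `sample`, since iterating over it yields lines in the same way.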

You might want to manually move through the list instead of using for row in mylist, like this:

    newlist = []
    i = 0
    while i < len(mylist):
        # Only look at the next row if there is one.
        if i + 1 < len(mylist) and can_collapse(mylist[i], mylist[i + 1]):
            newlist.append(collapse(mylist[i], mylist[i + 1]))
            i += 2
        else:
            newlist.append(mylist[i])
            i += 1
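The answer does not define `can_collapse` and `collapse`, so here is one possible sketch. It makes several assumptions:

- each row is a list of the 7 column values, with start and end already converted to integers
- the rows are sorted by start
- overlapping exon rows sit next to each other, as in the sample
- the merged row keeps the attribute of the first row, as in the expected output in the question
- strand is not compared

`collapse_all` is a hypothetical variant of the loop that folds each row into the previous output row, so a run of more than two overlapping exons also ends up as one row.

```python
def can_collapse(row_a, row_b):
    """True if both rows are exons and their inclusive [start, end] ranges overlap."""
    return (
        row_a[0] == "exon"
        and row_b[0] == "exon"
        and row_a[1] <= row_b[2]
        and row_b[1] <= row_a[2]
    )


def collapse(row_a, row_b):
    """Merge two rows: earliest start, latest end, all other fields from row_a."""
    merged = list(row_a)
    merged[1] = min(row_a[1], row_b[1])
    merged[2] = max(row_a[2], row_b[2])
    return merged


def collapse_all(rows):
    """Fold each row into the last output row for as long as the two can be collapsed."""
    result = []
    for row in rows:
        if result and can_collapse(result[-1], row):
            result[-1] = collapse(result[-1], row)
        else:
            result.append(list(row))
    return result


rows = [
    ["exon", 40932, 41071, ".", "+", ".", 'transcript "xm_092971"; name "xm_092971"'],
    ["snp", 45723, 45723, ".", "+", ".", 'snp "rs2227095"; name "rs2227095"'],
    ["exon", 134031, 134495, ".", "-", ".", 'transcript "xm_086913"; name "xm_086913"'],
    ["exon", 134034, 134457, ".", "-", ".", 'transcript "xm_086914"; name "xm_086914"'],
]
for row in collapse_all(rows):
    print(row)
```

For the four rows above, the first two come out unchanged and the last two become a single exon row spanning 134031 to 134495.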
