python - Removing characters from a txt file
How do I open a txt file and remove some special characters from a txt file of tweets?
My text looks something like this:
@xirwinshemmo follow :) hii... if u want make new friend add me on facebook! :) xx https:\/\/t.co\/rcyfvrmddg @ycmap enjoy tmrro. saw them earlier wk here in tokyo :)
I have to get rid of everything that starts with @ and every web page link (http). How do I do that?
This is what I have tried so far:
    import re
    a = []
    with open('englishtweets1.txt','r') as inf:
        a = inf.readlines()
        for line in a:
            line = re.sub(r['@'], line)
Use this:
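    import re

    data = open('englishtweets1.txt').read()
    # remove every @mention (an "@" followed by word characters)
    new_str = re.sub(r'@\w+', '', data)
    # remove any line that starts with an http(s) URL
    new_str = re.sub(r'^https?:\/\/.*[\r\n]*', '', new_str, flags=re.MULTILINE)
    # if needed, write the result to a new file:
    # open('removed.txt', 'w').write(new_str)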
Update (working, tested):
    # removes everything from "https" up to and including the next space
    new_str = re.sub(r'https.(.*?) ', '', new_str, flags=re.MULTILINE)
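For reference, here is a minimal sketch that handles both cases in one helper. The function name clean_tweet and both patterns are my own assumptions, not something from the question: it treats a mention as "@" followed by word characters, and a link as "http:" or "https:" followed by any run of non-space characters, which also covers the escaped slashes (https:\/\/) in the sample. Note that the mention pattern would also clip e-mail addresses, so adjust it if your data contains those.

```python
import re

# Assumed patterns (illustrative, tune them for your data):
MENTION = re.compile(r'@\w+')        # "@" followed by letters, digits or "_"
URL = re.compile(r'https?:\S+')      # "http:" or "https:" up to the next whitespace


def clean_tweet(text):
    """Remove @mentions and http(s) links, then tidy the leftover spaces."""
    text = MENTION.sub('', text)
    text = URL.sub('', text)
    # collapse runs of spaces/tabs left behind by the removals
    return re.sub(r'[ \t]+', ' ', text).strip()


sample = (r"@xirwinshemmo follow :) hii... if u want make new friend add me on "
          r"facebook! :) xx https:\/\/t.co\/rcyfvrmddg @ycmap enjoy tmrro. "
          r"saw them earlier wk here in tokyo :)")
print(clean_tweet(sample))
```

If the file holds one tweet per line, you could apply clean_tweet to each line as you read it and write the results to a new file.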