Python to search a string for the first occurrence of any item in a list
I have to parse a few thousand txt documents using Python; right now I'm getting the code working on one.
I'm trying to find the first time a month (January, February, March, etc.) appears in the document, and return the position of that first month. Every document has at least one month in it, but some have many months.
This works currently, but seems cumbersome:
    mytext = open('2.txt', 'r')
    mytext = mytext.read()
    january = mytext.find("january")
    february = mytext.find("february")
    march = mytext.find("march")
    april = mytext.find("april")
    may = mytext.find("may")
    june = mytext.find("june")
    july = mytext.find("july")
    august = mytext.find("august")
    september = mytext.find("september")
    october = mytext.find("october")
    november = mytext.find("november")
    december = mytext.find("december")
    monthpos = [january, february, march, april, may, june, july, august,
                september, october, november, december]
    monthpos = [x for x in monthpos if x != -1]  # drop months that were not found
    print(min(monthpos))  # returns the position of the first match
I thought I could combine any() and find() to get the job done, but there doesn't seem to be a better way to do this. I found this question, but it isn't clear and didn't help much. While I know this is wrong and will not work for many reasons, here is what I want to do:
    mytext = open('text.txt', 'r')
    mytext = mytext.read()
    months = ["january", "february", "march", "april", "may", "june",
              "july", "august", "september", "october", "november", "december"]
    print(mytext.find(months))  # where this would find the first time a month is matched
    # 1945  <- would return the location in the string where the first month is found
Thanks in advance.
I think this is what you want:
    # s holds the document text
    months = ["january", "february", "march", "april", "may", "june",
              "july", "august", "september", "october", "november", "december"]
    indices = [s.find(month) for month in months]
    first = min(index for index in indices if index > -1)
First, this finds the first appearance of each month (or -1 if it is not present), then takes the minimum of those indices, skipping any that are -1. It will throw a ValueError if no month is found, which may or may not be what you want.
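If the ValueError is not what you want, the same expression can be wrapped in a small helper that returns None instead. This is only a sketch, not part of the answer above: the name first_month_position and the decision to return None are assumptions, and it relies on the default argument of min(), which needs Python 3.4 or later.

```python
MONTHS = ["january", "february", "march", "april", "may", "june",
          "july", "august", "september", "october", "november", "december"]


def first_month_position(text, months=MONTHS):
    """Return the lowest index at which any month starts, or None if none occur.

    Hypothetical helper name; the logic is the min-of-find-results idea above.
    """
    # str.find returns -1 for a missing substring, so those are filtered out.
    indices = (text.find(month) for month in months)
    return min((i for i in indices if i != -1), default=None)


print(first_month_position("report filed in march, revised in january"))  # 16
print(first_month_position("no dates here"))  # None
```

Note that find() is case-sensitive, so if the documents write "January" you would probably want to pass text.lower() instead of the raw text.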
As Two-Bit Alchemist has commented, you can short-cut this for efficiency:
    # s holds the document text
    months = ["january", "february", "march", "april", "may", "june",
              "july", "august", "september", "october", "november", "december"]
    first = None
    for month in sorted(months, key=len):
        i = s[:first].find(month)  # only search the part of the string before the best match so far
        if i != -1:
            if first is None or i < first:
                first = i
            if i < len(month):  # not enough room left for any of the remaining months
                break
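A different technique, which neither the question nor the answer uses, is to let the re module do a single left-to-right scan with an alternation pattern. The sketch below is an assumption about how that might look; the names MONTH_RE and first_month_match are made up for illustration.

```python
import re

MONTHS = ["january", "february", "march", "april", "may", "june",
          "july", "august", "september", "october", "november", "december"]

# Builds the pattern "january|february|...|december".
# re.escape is not strictly needed for plain month names, but keeps the
# pattern safe if the list ever contains regex metacharacters.
MONTH_RE = re.compile("|".join(re.escape(month) for month in MONTHS))


def first_month_match(text):
    """Return (position, month) for the earliest month in text, or None."""
    match = MONTH_RE.search(text)  # search() returns the leftmost match
    if match is None:
        return None
    return match.start(), match.group()


print(first_month_match("due in june or july"))  # (7, 'june')
print(first_month_match("nothing to see"))       # None
```

This also tells you which month was found, not just where. Passing re.IGNORECASE to re.compile would make the search case-insensitive, if the documents capitalize month names.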