Monday, April 20, 2015

Extract a fraction and a 2- or 4-digit year from strings in multiple formats

I have 130,000+ strings that contain measurements such as 3/4", 1", etc., house numbers such as 5648 or 222, and years formatted like 02, 92 or 2004, depending on what the user felt like typing in that day. There are also \ and - characters in there at random, just to make it more fun.

What I need is the first measurement value (e.g. 3/4" or 2") and the year (e.g. 02 or 1997). I have tried multiple splits and replaces, but I don't seem to be getting very far. I have most of the measurements pulled out using a split at ". Any help would be appreciated. Someone suggested regular expressions, but I have never used them.

Here are some examples:

3/4"-6235\PE-03, 
1"-8018\ \PE-00, 
3/4"-    \ \PE-2004, 
1"-11769\ \74\COPSET, 
PE-85, 
1"-BLDG 1, 
COMM CABLE
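
For reference, the regular-expression route that was suggested could be sketched roughly like this against the examples above. This is only a sketch under assumptions, not a tested solution for all 130,000 strings: the function name parse_annotation and both patterns are made up for illustration, a 4-digit year is assumed to start with 19 or 20, and the last year-like number in the string is assumed to be the year (so a 2-digit house number appearing after the year would be picked up by mistake).

```python
import re

# A whole number or simple fraction followed by an inch mark, e.g. 3/4" or 1"
MEASURE_RE = re.compile(r'(\d+(?:/\d+)?)\s*"')

# A standalone 4-digit number starting with 19 or 20, or a standalone
# 2-digit number. The lookarounds reject digits that are part of a longer
# number, so house numbers such as 6235 or 11769 are skipped.
YEAR_RE = re.compile(r'(?<!\d)((?:19|20)\d{2}|\d{2})(?!\d)')


def parse_annotation(text):
    """Return (measurement, year) as strings; None for whichever is missing."""
    m = MEASURE_RE.search(text)
    measurement = m.group(1) + '"' if m else None

    # Blank out measurements so digits in e.g. 1/16" are not read as a year.
    rest = MEASURE_RE.sub(" ", text)
    years = YEAR_RE.findall(rest)
    # Assumption: the year is the last year-like number in the string.
    year = years[-1] if years else None
    return measurement, year


samples = [
    '3/4"-6235\\PE-03',
    '1"-8018\\ \\PE-00',
    '3/4"-    \\ \\PE-2004',
    '1"-11769\\ \\74\\COPSET',
    'PE-85',
    '1"-BLDG 1',
    'COMM CABLE',
]

for s in samples:
    print("%r -> %r" % (s, parse_annotation(s)))
```

Strings with no measurement or no year simply come back with None in that position, which avoids the nested try/except used to test for a number.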

Here is what I have currently.

for featureToTotal in featuresToTotal:
    id = id + 1
    # Get each Water Type Time Total
    try:
        ValueOne = featureToTotal[1]
        tmpvalue = ValueOne.replace("\\", "")
        tmpvalue = tmpvalue.replace("-", " ")
        tmpvalue = tmpvalue.replace("'", " ")
        newValue = tmpvalue.decode('string_escape')

        splitOne = newValue.split('\\')[0]
        Split2 = splitOne.split('-')[0]
        trysplit = Split2.split('"')[0]
        # Test for Number
        try:
            num = trysplit[:1]
            float(num)
            strval = str(trysplit)
            trysplit = strval
            #featureToTotal[4] = strval
            #arcpy.AddMessage(str(trysplit))
            #featuresToTotal.updateRow(featureToTotal)
        except:
            errstrr = "yep"
            #print "Nope" + ValueOne +  " " + trysplit

        buildqury = "INSERT INTO Annos VALUES(" + str(id) + ", ''" + newValue + "'', ''" + trysplit+ "'', ''" + YearTest +  "'')"
        cur.execute(buildqury)
    except:
        strerr = sys.exc_value.message
        print "Error Splitting  "