I am attempting to create a new column based on conditional logic from another column. I've tried searching and haven't been able to find anything that addresses my issue.
I have imported a CSV to a pandas dataframe, it is structured like this. I edited a few of the descriptions for this post, but other than that everything is the same:
#code used to load dataframe:
df = pd.read_csv(r"C:\filepath\filename.csv")
#output from print(type(df)):
#class 'pandas.core.frame.DataFrame'
#output from print(df.columns.values):
#['Type' 'Trans Date' 'Post Date' 'Description' 'Amount']
#output from print(df.columns):
Index(['Type', 'Trans Date', 'Post Date', 'Description', 'Amount'], dtype='object')
#output from print
Type Trans Date Post Date Description Amount
0 Sale 01/25/2018 01/25/2018 DESC1 -13.95
1 Sale 01/25/2018 01/26/2018 AMAZON MKTPLACE PMTS -6.99
2 Sale 01/24/2018 01/25/2018 SUMMIT BISTRO -5.85
3 Sale 01/24/2018 01/25/2018 DESC3 -9.13
4 Sale 01/24/2018 01/26/2018 DYNAMIC VENDING INC -1.60
I then write the following code:
def criteria(row):
if row.Description.find('SUMMIT BISTRO')>0:
return 'Lunch'
elif row.Description.find('AMAZON MKTPLACE PMTS')>0:
return 'Amazon'
elif row.Description.find('Aldi')>0:
return 'Groceries'
else:
return 'NotWorking'
df['Category'] = df.apply(criteria, axis=0)
Errors:
Traceback (most recent call last):
File "C:\Users\Test_BankReconcile2.py", line 44, in <module>
df['Category'] = df.apply(criteria, axis=0)
File "C:\Users\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4262, in apply
ignore_failures=ignore_failures)
File "C:\Users\Anaconda3\lib\site-packages\pandas\core\frame.py", line 4358, in _apply_standard
results[i] = func(v)
File "C:\Users\OneDrive\Documents\finance\Test_BankReconcile2.py", line 35, in criteria
if row.Description.find('SUMMIT BISTRO')>0:
File "C:\Users\Anaconda3\lib\site-packages\pandas\core\generic.py", line 3081, in __getattr__
return object.__getattribute__(self, name)
AttributeError: ("'Series' object has no attribute 'Description'", 'occurred at index Type')
I'm able to successfully execute this same sort of command on a very similar csv file from a different bank (this example is from my credit card), so I don't know what is going on but possibly I need to define the dataframe in some way that I'm not doing? Or possibly something else that is very obvious that I'm not seeing? Thank you all in advance for helping me solve this.
df.apply(criteria, axis=1)
df.apply(func, axis=0)
(axis=0
is default) applies the functionfunc
to each column ofdf
(the columns of a DataFrame are series). So your functioncriteria(row)
isn't actually receiving a row, but a column. Changing toaxis=1
should fix things.