You can write a function that takes a row as a parameter, tests whatever conditions you want, and returns True or False - which you can then use as a selection tool. (Though on rereading your question, this may not be what you're looking for - see Part 2 below.)
Part 1 - Perform a Selection
Apply this function to your dataframe row by row (axis=1), and use the returned Series of True/False values as a boolean mask to select rows from the dataframe itself.
e.g.
def selector(row):
    # True when (a > 0 and b == 3) or c > 2, otherwise False
    if row['a'] > 0 and row['b'] == 3:
        return True
    elif row['c'] > 2:
        return True
    else:
        return False
You can build whatever logic you like, just ensure it returns True when you want a match and False when you don't.
Then try something like
df.apply(selector, axis=1)
And it will return a Series of True/False values, one per row. Plug that into your df to select only those rows that have a True value calculated for them.
df[df.apply(selector, axis=1)]
And that should give you what you want.
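To see the selection end to end, here is a minimal sketch on a small made-up dataframe. The values in `df` are invented for illustration; only the column names `a`, `b` and `c` come from the function above. It also shows what should be an equivalent vectorized mask, which is usually faster on large frames because it avoids a Python function call per row.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration.
df = pd.DataFrame({
    "a": [1, -2, 0, 4],
    "b": [3, 3, 1, 2],
    "c": [0, 5, 1, 2],
})

def selector(row):
    # True when (a > 0 and b == 3) or c > 2, otherwise False.
    if row["a"] > 0 and row["b"] == 3:
        return True
    elif row["c"] > 2:
        return True
    else:
        return False

# Row-wise boolean Series, one value per row.
mask = df.apply(selector, axis=1)

# Keep only the rows where the mask is True.
selected = df[mask]

# The same condition written as a vectorized mask (no per-row function call).
vector_mask = ((df["a"] > 0) & (df["b"] == 3)) | (df["c"] > 2)
selected_vectorized = df[vector_mask]
```

With this sample data, the first two rows match and the last two do not. If your conditions are simple comparisons like these, the vectorized form is worth considering; the `apply` version is handy when the logic is too awkward to express that way.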
Part 2 - Perform a Calculation
If you want to create a new column containing some calculated result, it's a similar operation. Create a function that performs your calculation:
def mycalc(row):
    # a + b when a > 5, otherwise the constant 66
    if row['a'] > 5:
        return row['a'] + row['b']
    else:
        return 66
Only this time, apply the function and assign the result to a new column name:
df['value'] = df.apply(mycalc, axis=1)
And this will give you that result.
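As a rough illustration of the calculation step, here is a self-contained sketch. The sample values and the extra column name `value_vectorized` are made up for this example. It also shows `numpy.where` as one possible vectorized alternative, assuming the calculation is a simple either/or like this one.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration.
df = pd.DataFrame({"a": [2, 6, 10], "b": [1, 4, -3]})

def mycalc(row):
    # a + b when a > 5, otherwise the constant 66.
    if row["a"] > 5:
        return row["a"] + row["b"]
    else:
        return 66

# Row-wise calculation assigned to a new column.
df["value"] = df.apply(mycalc, axis=1)

# Vectorized alternative: pick a + b where a > 5, else 66.
df["value_vectorized"] = np.where(df["a"] > 5, df["a"] + df["b"], 66)
```

Both columns should hold the same numbers here. Note that `np.where` evaluates both branches for every row before choosing, so it suits cheap, side-effect-free expressions; for more involved per-row logic, the `apply` approach stays easier to read.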