
Suppose I have the DataFrame below:

import pandas as pd

data_dic = {
    "a": [0,0,1,2],
    "b": [0,3,4,5],
    "c": [6,7,8,9]
}
df = pd.DataFrame(data_dic)

Result:

   a  b  c
0  0  0  6
1  0  3  7
2  1  4  8
3  2  5  9

I need to assign a value to a new column, taken from the above columns based on these conditions:

if df.a > 0 then use df.a
else if df.b > 0 then use df.b
else use df.c

So far I have tried:

df['value'] = [x if x > 0 else 'ww' for x in df['a']]

but I don't know how to add more conditions to this.

Expected result:

   a  b  c  value
0  0  0  6      6
1  0  3  7      3
2  1  4  8      1
3  2  5  9      2

Thank you for your hard work.

  • This is definitely NOT the same question as stackoverflow.com/questions/19913659, as marked above. This question asks how to use a default value taken from an existing column, which is what I was looking for. The other question uses a pre-defined set of choices based on the condition, which didn't help me. Commented Nov 10, 2023 at 23:27

3 Answers


Use numpy.select:

import numpy as np

df['value'] = np.select([df.a > 0, df.b > 0], [df.a, df.b], default=df.c)
print(df)
   a  b  c  value
0  0  0  6      6
1  0  3  7      3
2  1  4  8      1
3  2  5  9      2

Difference between the vectorized and loop solutions on 400k rows:

df = pd.concat([df] * 100000, ignore_index=True)

In [158]: %timeit df['value2'] = np.select([df.a > 0 , df.b > 0], [df.a, df.b], default=df.c)
9.86 ms ± 611 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [159]: %timeit df['value1'] = [x if x > 0 else y if y>0 else z for x,y,z in zip(df['a'],df['b'],df['c'])]
399 ms ± 52.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
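If you prefer not to use np.select, the same logic can also be expressed with nested numpy.where calls; a minimal sketch, assuming the same df as above:

import numpy as np

# take a where a > 0, otherwise b where b > 0, otherwise fall back to c
df['value'] = np.where(df.a > 0, df.a, np.where(df.b > 0, df.b, df.c))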
  • I knew the numpy way would be faster, but that is a lot faster even in a small df.
    – Neo
    Commented Aug 8, 2019 at 9:34

You can also use list comprehension:

df['value'] = [x if x > 0 else y if y>0 else z for x,y,z in zip(df['a'],df['b'],df['c'])]
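For readability, the same row-wise logic can be factored into a small helper function; a sketch, where the name first_positive is just illustrative:

def first_positive(a, b, c):
    # return the first of a and b that is positive, otherwise c
    if a > 0:
        return a
    if b > 0:
        return b
    return c

df['value'] = [first_positive(a, b, c) for a, b, c in zip(df['a'], df['b'], df['c'])]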
  • Thank you for the nice solution. Commented Aug 8, 2019 at 9:35
  • It is intuitive, however the answer from @jezrael is a lot better performance-wise, so please accept it.
    – Neo
    Commented Aug 8, 2019 at 9:39

You can write a function that takes a row as a parameter, tests whatever conditions you want to test, and returns a True or False result, which you can then use as a selection tool. (Though on rereading your question, this may not be what you're looking for; see Part 2 below.)

Part 1 - Perform a Selection

Apply this function to your DataFrame, and use the returned Series of True/False answers as a boolean index to select rows from the DataFrame itself.

e.g.

def selector(row):
    # return True for rows that should be selected
    if row['a'] > 0 and row['b'] == 3:
        return True
    elif row['c'] > 2:
        return True
    else:
        return False

You can build whatever logic you like, just ensure it returns True when you want a match and False when you don't.

Then try something like

df.apply(lambda row: selector(row), axis=1)

And it will return a Series of True/False answers. Plug that back into your df to select only the rows for which True was calculated.

df[df.apply(lambda row: selector(row), axis=1)]

And that should give you what you want.
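As a side note, the lambda wrapper isn't strictly needed here; passing the function directly should be equivalent (a small sketch, reusing the selector defined above):

df[df.apply(selector, axis=1)]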

Part 2 - Perform a Calculation

If you want to create a new column containing some calculated result, it's a similar operation: create a function that performs your calculation:

def mycalc(row):
    # whatever calculation you need per row
    if row['a'] > 5:
        return row['a'] + row['b']
    else:
        return 66

Only this time, apply the function and assign the result to a new column name:

df['value'] = df.apply(lambda row: mycalc(row), axis=1)

And this will give you that result.
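Applied to the conditions in the original question (a if a > 0, else b if b > 0, else c), the same apply pattern might look like the sketch below; the function name pick_value is just illustrative:

def pick_value(row):
    # first positive of a and b, otherwise fall back to c
    if row['a'] > 0:
        return row['a']
    elif row['b'] > 0:
        return row['b']
    else:
        return row['c']

df['value'] = df.apply(pick_value, axis=1)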
