Creating dummy columns #21

ammaryh92 · 2022-08-18T15:12:07Z

In Chapter 26, "Reshaping DataFrames with Dummies", the goal is to turn the values in the "job.role" columns into a categorical series and then reshape that series into a dummy matrix.

This is the code from the book:

job = (jb
    .filter(like=r'job.role')
    .where(jb.isna(), 1)
    .fillna(0)
    .idxmax(axis='columns')
    .str.replace('job.role.', '', regex=False))

job

However, many rows list more than one job role, and the code above captures only the first one, because idxmax returns the label of the first column that holds the row maximum.
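Here is a minimal sketch of the limitation on a made-up frame. The column names and values are hypothetical, not taken from the survey data, and `notna().astype(int)` stands in for the book's `where`/`fillna` pair so that the mask is plainly numeric before `idxmax` runs.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: each job.role.* column holds the role name or NaN
toy = pd.DataFrame({
    'job.role.Analyst': ['Analyst', np.nan, 'Analyst'],
    'job.role.Developer': ['Developer', 'Developer', np.nan],
})

first_only = (toy
    .filter(like='job.role')
    .notna()                   # True where a role was selected
    .astype(int)               # 1/0 mask, same idea as where(...).fillna(0)
    .idxmax(axis='columns')    # label of the FIRST column holding the row max
    .str.replace('job.role.', '', regex=False))

print(first_only)
```

Row 0 has both roles selected, yet only one label comes back for it.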

I think the following code captures all jobs and converts them into a dummy matrix.

(jb
    .filter(like='job.role')
    .fillna('')
    .apply(lambda ser: ','.join([i for i in ser if i]), axis=1)
    .str.get_dummies(sep=',')
)
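As a quick check, the same chain can be run on a small made-up frame. The toy columns and values below are hypothetical, and the sketch assumes each "job.role" cell holds either the role name as a string or NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the survey itself
toy = pd.DataFrame({
    'job.role.Analyst': ['Analyst', np.nan, np.nan],
    'job.role.Developer': ['Developer', 'Developer', np.nan],
})

dummies = (toy
    .filter(like='job.role')
    .fillna('')                                                   # NaN -> empty string
    .apply(lambda ser: ','.join([i for i in ser if i]), axis=1)   # join non-empty roles per row
    .str.get_dummies(sep=',')                                     # one 0/1 column per role
)

print(dummies)
```

One caveat with this approach: a role name that itself contains a comma would be split into two dummy columns. Joining on a separator that cannot appear in the values (for example `'|'`, the default `sep` of `str.get_dummies`) would avoid that.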
