Since pandas 0.19.0, you can pass dtype='category' to read_csv to parse every column as categorical:
import pandas as pd
from io import StringIO  # pd.compat.StringIO was removed in later pandas releases

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'
df = pd.read_csv(StringIO(data), dtype='category')
print(df)
  col1 col2 col3
0    a    b    1
1    a    b    2
2    c    d    3
print(df.dtypes)
col1    category
col2    category
col3    category
dtype: object
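One detail worth checking in that output: according to the pandas documentation, dtype='category' parses the categories as strings, so col3 holds '1', '2', '3' rather than integers. Here is a minimal sketch of converting them afterwards, assuming pd.to_numeric combined with rename_categories suits your data (the names string_categories and new_categories are only illustrative):

```python
from io import StringIO

import pandas as pd

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'
df = pd.read_csv(StringIO(data), dtype='category')

# Categories read this way are strings, even for the numeric-looking column.
string_categories = list(df['col3'].cat.categories)
print(string_categories)

# Convert the category labels to numbers; the column itself stays categorical.
new_categories = pd.to_numeric(df['col3'].cat.categories)
df['col3'] = df['col3'].cat.rename_categories(new_categories)
print(list(df['col3'].cat.categories))
```

If a column is really numeric and you do not need it to be categorical, leaving it out of dtype (as in the dictionary form) is probably simpler.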
To make only specific columns categorical, pass dtype a dictionary mapping column names to types; columns you leave out keep their inferred types:
# StringIO here is io.StringIO (from io import StringIO)
df = pd.read_csv(StringIO(data), dtype={'col1': 'category'})
print(df)
  col1 col2 col3
0    a    b    1
1    a    b    2
2    c    d    3
print(df.dtypes)
col1    category
col2      object
col3       int64
dtype: object
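If you also need to control which categories exist and in what order, the dictionary can hold a CategoricalDtype instead of the string 'category'. This is a sketch assuming pandas 0.21 or later, where CategoricalDtype can be imported from pandas.api.types; the category order ['c', 'a'] and the name col1_type are chosen arbitrarily for illustration:

```python
from io import StringIO

import pandas as pd
from pandas.api.types import CategoricalDtype

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'

# Hypothetical ordering for illustration: 'c' sorts before 'a'.
col1_type = CategoricalDtype(categories=['c', 'a'], ordered=True)
df = pd.read_csv(StringIO(data), dtype={'col1': col1_type})

print(df['col1'].cat.categories)  # the categories in the order given above
print(df['col1'].cat.ordered)     # whether the categorical is ordered
print(df['col1'].min())           # smallest value under that ordering
```

Values in the file that are not listed in categories should come back as missing (NaN), so list every value you expect to see.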