Pandas tips
Pandas is used constantly in Jupyter notebooks for loading and exploring datasets, so it's worth getting familiar with it early.
- It helps to learn the most common usage patterns first.
- Merging cells where appropriate is also helpful.
- Geocoding spatial data
read_csv()
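A minimal sketch of read_csv(), assuming a small made-up CSV with callsign and altitude_1 columns (the data is hypothetical and only there so the example runs without a file on disk). read_csv() accepts a file path or any file-like object:

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = "callsign,altitude_1\nABC123,10000\nXYZ789,12000\nABC123,11000\n"

# read_csv takes a path or a file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (rows, columns)
print(df.dtypes)  # column types inferred by pandas
```

With a real file, you would pass the path instead of the StringIO object.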
copy()
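df2 = df1.copy()  # independent copy: changes to df2 do not touch df1
df2['b'] = df2['b'] + 100
df2  # column 'b' is shifted by 100
df1  # unchanged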
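To see why copy() matters, here is a small sketch that contrasts it with plain assignment. The contents of df1 are made up for illustration; only the column name 'b' comes from the snippet above.

```python
import pandas as pd

# Hypothetical starting frame.
df1 = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

alias = df1        # plain assignment: both names point to the same object
df2 = df1.copy()   # copy() makes an independent (deep) copy by default

df2["b"] = df2["b"] + 100  # only df2 changes

print(alias is df1)        # the alias is the very same DataFrame
print(df1["b"].tolist())   # original values are untouched
print(df2["b"].tolist())   # shifted values live only in the copy
```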
concat()
df3 = pd.concat([df1, df2])
df3
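One thing to watch with concat() is that it keeps each frame's original index, so row labels can repeat. A short sketch with made-up data shows the difference when ignore_index=True is passed:

```python
import pandas as pd

# Hypothetical frames for illustration.
df1 = pd.DataFrame({"a": [1, 2], "b": [10, 20]})
df2 = df1.copy()
df2["b"] = df2["b"] + 100

# Default: rows are stacked and the original index labels are kept.
df3 = pd.concat([df1, df2])

# ignore_index=True renumbers the rows from 0.
df3_reset = pd.concat([df1, df2], ignore_index=True)

print(df3.index.tolist())
print(df3_reset.index.tolist())
```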
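If you have multiple files to deal with, you can combine pd.concat and pd.read_csv to load them into a single DataFrame.
# path_data is assumed to be a pathlib.Path pointing at the data folder
for i in path_data.glob("*.csv"):
    print(i)
flightlist = pd.concat(pd.read_csv(file) for file in path_data.glob("*.csv"))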
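Here is a self-contained sketch of the same pattern. It writes two tiny CSV files into a temporary folder so that it can run anywhere; the file names, the column values, and the extra source column are all assumptions added for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    path_data = Path(tmp)  # stands in for the real data folder

    # Hypothetical input files.
    (path_data / "part1.csv").write_text("callsign,altitude_1\nABC123,10000\n")
    (path_data / "part2.csv").write_text("callsign,altitude_1\nXYZ789,12000\n")

    # Sort the paths so the row order is reproducible.
    files = sorted(path_data.glob("*.csv"))

    # Read each file, tag its rows with the file name, and stack everything.
    flightlist = pd.concat(
        (pd.read_csv(f).assign(source=f.name) for f in files),
        ignore_index=True,
    )

print(flightlist)
```

Tagging each row with its source file is optional, but it makes it easier to trace a row back to where it came from.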
value_counts()
Counts how many times each unique value occurs.
df['callsign'].value_counts()
The counts can also be returned as relative frequencies by setting normalize=True.
df['callsign'].value_counts(normalize=True)
It can also be used for continuous data: the bins argument groups the values into discrete intervals and counts how many fall in each one.
df['altitude_1'].value_counts(bins=10)
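The three variants can be tried side by side. This is a small sketch on made-up data; the column names match the snippets above, but the values are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "callsign": ["ABC123", "ABC123", "XYZ789", "ABC123"],
    "altitude_1": [1000, 2000, 9000, 10000],
})

# Number of rows per callsign, most frequent first.
counts = df["callsign"].value_counts()

# Same counts expressed as fractions that sum to 1.
shares = df["callsign"].value_counts(normalize=True)

# Continuous values grouped into 3 equal-width intervals;
# sort=False keeps the intervals in ascending order.
binned = df["altitude_1"].value_counts(bins=3, sort=False)

print(counts)
print(shares)
print(binned)
```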
- More tips for data analysis can be found here
- including the percentage of missing data, finding rows with maximum values, aggregating across columns, and more