Content provided by Sanket Gupta. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Sanket Gupta or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://pt.player.fm/legal.

17: Why Pandas is the new Excel

16:37
 

The Data Life Podcast is a podcast where we talk all about real-life experiences with data and data science tools, techniques, models, and personalities.

In this episode, we will talk about how Pandas is becoming a tool of choice for many data scientists for doing their data analysis work. We will explore how Pandas wins over Excel in several key areas that are important for businesses today:

1) Large dataset sizes
2) Different kinds of input formats such as JSON, CSV, HTML, SQL, etc.
3) Complex business logic
4) Linking data analysis work to websites and databases
5) Cost
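
To make points 2 and 4 a little more concrete, here is a minimal sketch of loading the same small table from CSV, JSON, and a SQL database. The data, the column names, and the "sales" table are hypothetical, and an in-memory SQLite database stands in for whatever database a business actually uses.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample data; the column names are made up for illustration.
csv_text = "region,units\nnorth,10\nsouth,7\n"
json_text = '[{"region": "north", "units": 10}, {"region": "south", "units": 7}]'

# CSV and JSON: read_csv and read_json accept file paths or file-like objects.
from_csv = pd.read_csv(io.StringIO(csv_text))
from_json = pd.read_json(io.StringIO(json_text))

# SQL: write the table to an in-memory SQLite database (an assumption made
# for this sketch), then query it back into a DataFrame with read_sql.
conn = sqlite3.connect(":memory:")
from_csv.to_sql("sales", conn, index=False)
from_sql = pd.read_sql("SELECT region, units FROM sales", conn)
conn.close()

# All three sources end up as ordinary DataFrames with the same columns.
print(from_csv.shape, from_json.shape, from_sql.shape)
```

Once the data is in a DataFrame, the rest of the analysis does not depend on where it came from.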

Pandas has many helpful functions, such as read_csv, read_json, and read_sql, that make it easy to load data into DataFrames. DataFrames offer useful methods such as "describe", "value_counts", and "groupby", plus the "loc" indexer, that make it easy to understand your dataset. Pandas also supports plotting out of the box through the "plot" method.
We also cover how Pandas differs from SQL in areas such as ease of handling time series data, visualizations, and more.
Tune in to the episode to learn more about how Pandas might be the tool for your data analysis needs and take your business to the next level!
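
As a rough illustration of the methods named above, the sketch below runs them on a tiny, invented sales table. The column names and values are assumptions for this example only. The last step shows one kind of time series handling, a monthly total via "resample".

```python
import pandas as pd

# Hypothetical sales records; names and numbers are invented for illustration.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2019-01-05", "2019-01-20", "2019-02-03", "2019-02-17"]
        ),
        "region": ["north", "south", "north", "north"],
        "units": [10, 7, 4, 6],
    }
)

# describe: summary statistics (count, mean, std, min, quartiles, max).
summary = df["units"].describe()

# value_counts: how many rows fall into each region.
region_counts = df["region"].value_counts()

# groupby: total units per region.
units_by_region = df.groupby("region")["units"].sum()

# loc: select rows with a boolean condition and columns by label.
north_rows = df.loc[df["region"] == "north", ["date", "units"]]

# Time series: index by date, then total the units per calendar month.
monthly_units = df.set_index("date")["units"].resample("MS").sum()

print(units_by_region)
print(monthly_units)
```

With matplotlib installed, calling monthly_units.plot() would chart the monthly totals.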

Fantastic Resources:
1) Book by Pandas creator Wes McKinney: https://www.amazon.com/dp/1491957662/?tag=omnilence-20
2) Great workshop video by Kevin Markham at PyCon: https://www.youtube.com/watch?v=0hsKLYfyQZc
3) Input/output methods for Pandas: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html
4) Comparison of some Pandas operations with SQL: https://pandas.pydata.org/pandas-docs/stable/getting_started/comparison/comparison_with_sql.html

Thanks for listening! Please consider supporting this podcast via the link at the end.

--- Send in a voice message: https://podcasters.spotify.com/pod/show/the-data-life-podcast/message Support this podcast: https://podcasters.spotify.com/pod/show/the-data-life-podcast/support

27 episodes
