Getting started with Python as a data scientist can feel overwhelming, with new jargon flying everywhere even for the most basic tasks. It is therefore essential to first get an idea of the basic libraries that exist, what they do and when to use them, before taking on the task of using them in your code. Here is a brief and hopefully helpful intro to the most common Python libraries for beginners.

Pandas

Pandas is an open source library for working with tabular data, developed by Wes McKinney (author of the book Python for Data Analysis) in 2008. Pandas has two main data structures (a data structure is a container that holds data in a specific way), namely the Series and the DataFrame. A pandas Series is a one-dimensional array of labelled data with an index attached to it, whilst a DataFrame is a two-dimensional table of data, consisting of multiple Series that share the same index.

Here is an example of a pandas Series:

0          Apple
1         Banana
2         Cherry
3    Dragonfruit
dtype: object
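
A Series like the one above could be created roughly as follows. This is only a minimal sketch: the variable name `fruits` is made up for illustration, and the exact `dtype` line that gets printed may differ between pandas versions.

```python
import pandas as pd

# Build a Series from a plain Python list.
# With no index given, pandas labels the values 0, 1, 2, ...
fruits = pd.Series(["Apple", "Banana", "Cherry", "Dragonfruit"])

# Printing shows the index on the left and the values on the right
print(fruits)

# Look up a single value by its index label
print(fruits.loc[2])
```

The index is what makes a Series more than a list: every value carries a label, and `.loc` looks values up by that label.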

And here is an example of a DataFrame:

         fruit  quantity price
0        Apple        25   $30
1       Banana        30   $10
2       Cherry        30   $20
3  Dragonfruit         5   $50
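
One common way to build a DataFrame like this is from a dictionary of columns. The sketch below assumes the prices are stored as text (including the `$` sign), as they appear in the table; the variable names `df` and `quantities` are illustrative only.

```python
import pandas as pd

# Each dictionary key becomes a column name,
# and each list holds the values for that column
df = pd.DataFrame({
    "fruit": ["Apple", "Banana", "Cherry", "Dragonfruit"],
    "quantity": [25, 30, 30, 5],
    "price": ["$30", "$10", "$20", "$50"],  # kept as text, not numbers
})

print(df)

# Selecting a single column gives back a Series
quantities = df["quantity"]
print(type(quantities))

# Numeric columns support calculations such as a total
print(quantities.sum())
```

Pulling out one column returns a Series, which is a practical way to see that a DataFrame is made up of several Series sharing one index.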