Computer Science, asked by xiyogid350, 10 months ago

Explain the following:
a) df[df['C']>23].index.tolist()
b) plt.hist(data,bins=(0,10,20,30),edgecolor="yellow",orientation='horizontal')
c) a1.add(a2, fill_value=0)
d) Cur=db.cursor()
e) df = pd.DataFrame(dict, index = [0, 1, 2, 3])
mask = df.index == 0
print(df[mask])

Answers

Answered by Hemalathajothimani
3

Answer:

Explanation:
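Snippets (a), (c) and (e) from the question can be tried on small made-up data. The sketch below is only an illustration: the column values, index labels and the `data` dictionary are assumptions, since the question does not show the real data. The question names its dictionary `dict`; the sketch uses `data` instead so the built-in `dict` is not shadowed.

```python
import pandas as pd

# (a) Filter rows where column 'C' is greater than 23, then return
# the index labels of those rows as a plain Python list.
df = pd.DataFrame({"C": [10, 25, 40, 23]}, index=["w", "x", "y", "z"])
labels = df[df["C"] > 23].index.tolist()
print(labels)  # ['x', 'y']

# (c) Add two Series aligned on their index. A label that is missing
# from one Series is treated as 0 instead of producing NaN.
a1 = pd.Series([1, 2, 3], index=["p", "q", "r"])
a2 = pd.Series([10, 20], index=["q", "s"])
total = a1.add(a2, fill_value=0)
print(total)

# (e) Build a DataFrame with an explicit index, then use a boolean
# mask over the index to keep only the row labelled 0.
data = {"name": ["Ann", "Bob", "Cy", "Di"], "age": [31, 25, 40, 19]}
df2 = pd.DataFrame(data, index=[0, 1, 2, 3])
mask = df2.index == 0
print(df2[mask])
```

In (c), a plain `a1 + a2` would give NaN for the labels p, r and s because each appears in only one Series; `fill_value=0` is what avoids that.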

Creating Histograms using Pandas

When exploring a dataset, you’ll often want to get a quick understanding of the distribution of certain numerical variables within it. A common way of visualizing the distribution of a single numerical variable is by using a histogram. A histogram divides the values within a numerical variable into “bins”, and counts the number of observations that fall into each bin. By visualizing these binned counts in a columnar fashion, we can obtain a very immediate and intuitive sense of the distribution of values within a variable.

This recipe shows how to create a histogram in Python. Specifically, you’ll be using pandas’ hist() method, which is a thin wrapper around the matplotlib pyplot API.
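Because pandas delegates to matplotlib, the `plt.hist` call from snippet (b) of the question behaves the same way. Here is a minimal sketch with made-up values in `data` (an assumption, the question does not show them). The `Agg` backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

# Hypothetical sample values; the question does not show the real data.
data = [1, 5, 9, 10, 15, 22, 29, 30]

# bins=(0, 10, 20, 30) gives three bins: [0, 10), [10, 20) and [20, 30].
# The last bin includes its right edge, so the value 30 is counted.
# edgecolor outlines each bar in yellow; orientation='horizontal'
# draws the bars sideways, with counts along the x-axis.
counts, edges, patches = plt.hist(
    data,
    bins=(0, 10, 20, 30),
    edgecolor="yellow",
    orientation="horizontal",
)
print(counts)  # number of values that fell into each bin
plt.close()
```

`plt.hist` returns the per-bin counts, the bin edges and the drawn bar objects, so the binning can be checked without looking at the picture.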

In this example, you’ll visualize the distribution of session durations for a website. The steps in this recipe are divided into the following sections:

Data Wrangling

Data Exploration & Preparation

Data Visualization

You can find implementations of all of the steps outlined below in this example Mode report. Let’s get started.

Data Wrangling

You’ll use SQL to wrangle the data you’ll need for the analysis. For this example, you’ll be using the sessions dataset available in Mode’s Public Data Warehouse. Using the schema browser within the editor, make sure your data source is set to the Mode Public Warehouse data source and run the following query to wrangle your data:

select *
from modeanalytics.sessions

Once the SQL query has completed running, rename your SQL query to Sessions so that you can easily identify it within the Python notebook.
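Outside Mode, a query like this is usually run through a database cursor, which is what snippet (d) of the question creates with `db.cursor()`. The sketch below uses Python's built-in `sqlite3` module and an in-memory table as stand-ins. Both are assumptions: the question does not say which database `db` is connected to, and the table columns here are invented.

```python
import sqlite3

# Stand-in connection: an in-memory SQLite database (assumption).
db = sqlite3.connect(":memory:")

# The cursor is the object used to send SQL statements and read results.
cur = db.cursor()

# Hypothetical table and rows, only so the query has something to return.
cur.execute("create table sessions (session_id integer, duration_seconds real)")
cur.executemany(
    "insert into sessions values (?, ?)",
    [(1, 35.0), (2, 120.5), (3, 64.2)],
)

# Run the query and fetch every row as a list of tuples.
cur.execute("select * from sessions")
rows = cur.fetchall()
print(rows)

db.close()
```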

Data Exploration & Preparation

Now that you have your data wrangled, you’re ready to move over to the Python notebook to prepare your data for visualization. Inside of the Python notebook, let’s start by importing the Python modules that you’ll be using throughout the remainder of this recipe:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

Mode automatically pipes the results of your SQL queries into pandas dataframes, which are available through the variable datasets and looked up by query name. You can use the following line of Python to access the results of your SQL query as a dataframe and assign them to a new variable:

df = datasets['Sessions']

You can get a sense of the shape of your dataset using the dataframe shape attribute:

df.shape

The shape attribute of a dataframe is a tuple containing its dimensions (rows, columns). In this example, the sessions dataset is 200,000 rows (sessions) by 6 columns. You can investigate the data types of the variables within your dataset through the dtypes attribute:

df.dtypes

The dtypes attribute lists the data type of each variable in the dataframe. In this example, pandas correctly inferred the data types of certain variables but left a few as the object data type. You can manually cast these variables to more appropriate data types:

# Data type conversions
df['created_at'] = df['created_at'].astype('datetime64[ns]')
df['user_type'] = df['user_type'].astype('category')

# Show new data types
df.dtypes
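The `datasets` variable only exists inside Mode, so the preparation steps above can be tried elsewhere with a small stand-in dataframe. The rows below are invented, and the `session_duration_seconds` column name is an assumption; only `created_at` and `user_type` come from the recipe. `pd.to_datetime` is used here as a common alternative to `astype('datetime64[ns]')`.

```python
import pandas as pd

# Stand-in for datasets['Sessions']; the values are made up.
df = pd.DataFrame({
    "created_at": ["2020-01-01 08:00:00", "2020-01-01 09:30:00", "2020-01-02 14:15:00"],
    "user_type": ["new", "returning", "new"],
    "session_duration_seconds": [35.0, 120.5, 64.2],
})

print(df.shape)   # (3, 3)
print(df.dtypes)  # the two text columns are not yet typed usefully

# Cast the text columns to more appropriate types.
df["created_at"] = pd.to_datetime(df["created_at"])
df["user_type"] = df["user_type"].astype("category")

print(df.dtypes)
```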

Now that you have your dataset prepared, you are ready to visualize the data.

Data Visualization

To create a histogram, use pandas’ hist() method. Calling hist() on a pandas dataframe draws one histogram for each numeric column and skips columns that cannot be binned, such as text.
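As a rough illustration of that behaviour, the sketch below calls `hist()` on a made-up dataframe with one numeric column and one text column. The column names and values are assumptions, not the real sessions data, and the `Agg` backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import pandas as pd

# Made-up data: one numeric column and one text column.
df = pd.DataFrame({
    "session_duration_seconds": [35.0, 120.5, 64.2, 15.8, 240.0, 98.1],
    "user_type": ["new", "returning", "new", "new", "returning", "new"],
})

# hist() returns an array of matplotlib Axes, one per plotted column.
# Only the numeric column is plotted; 'user_type' is skipped.
axes = df.hist(bins=5, grid=False, edgecolor="black")
ax = axes.ravel()[0]

# The bar heights add up to the number of rows, since each row
# falls into exactly one bin.
bar_total = sum(p.get_height() for p in ax.patches)
print(len(axes.ravel()), bar_total)
```

Passing `column='session_duration_seconds'` restricts the plot to a single column when the dataframe has several numeric ones.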

Answered by xcristianox
2
  • When exploring a dataset, you’ll often want to get a quick understanding of the distribution of certain numerical variables within it. A common way of visualizing the distribution of a single numerical variable is by using a histogram. A histogram divides the values within a numerical variable into “bins”, and counts the number of observations that fall into each bin. By visualizing these binned counts in a columnar fashion, we can obtain a very immediate and intuitive sense of the distribution of values within a variable.

  • This recipe shows how to create a histogram in Python. Specifically, you’ll be using pandas’ hist() method, which is a thin wrapper around the matplotlib pyplot API.

  • In our example, you’re going to be visualizing the distribution of session duration for a website. The steps in this recipe are divided into the following sections:
