Analyzing Zoom meeting data

Before you turn this problem in, make sure everything runs as expected. First, restart the kernel (in the menubar, select Kernel\(\rightarrow\)Restart) and then run all cells (in the menubar, select Cell\(\rightarrow\)Run All).

Make sure you fill in any place that says “YOUR CODE HERE” or “YOUR ANSWER HERE”, as well as your name and collaborators below:

NAME = ""
COLLABORATORS = ""

We held an online symposium on Catalysis (https://www.youtube.com/channel/UCURlr_JpZvatrGpwaDJVGJw/videos) using Zoom. Zoom collects a lot of data on its meetings, and a sanitized version (meaning all personally identifying information has been stripped from it) is provided in the accompanying data file.

In this data file, each attendee is represented by a random string in the Email column.

Your tasks are to read this data into Pandas, and then use it to answer the following questions.

Hint: pass an argument such as na_values='--' to read_csv. It converts those placeholder strings to NaN, which is easier to work with than text mixed into a numeric column.
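To see what the na_values hint does, here is a minimal sketch on a tiny made-up table. The rows, the Country column name, and the double-hyphen placeholder are assumptions for illustration; only the Email and Time in Session column names come from the assignment text.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real data file; these rows are invented.
csv_text = """\
Email,Time in Session,Country
abc123,42,United States
def456,--,Germany
abc123,15,United States
ghi789,30,India
"""

# na_values tells read_csv to treat the listed strings as missing (NaN).
df = pd.read_csv(io.StringIO(csv_text), na_values="--")

# With the placeholder converted, the column is parsed as numeric.
print(df.dtypes)
print(df)
```

Without na_values, the placeholder would force the whole Time in Session column to be read as text, and numeric operations on it would fail or give misleading results.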

Use df.describe() to summarize the DataFrame.
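As a rough illustration of what describe() reports, the sketch below uses a small invented DataFrame rather than the real data. By default only numeric columns are summarized; passing include="all" adds the text columns as well.

```python
import numpy as np
import pandas as pd

# Invented example data; the real file will have different columns and values.
df = pd.DataFrame(
    {
        "Email": ["abc123", "def456", "abc123", "ghi789"],
        "Time in Session": [42.0, np.nan, 15.0, 30.0],
    }
)

# Numeric columns only: count, mean, std, min, quartiles, max.
summary = df.describe()
print(summary)

# Include text columns too (count, unique, top, freq).
print(df.describe(include="all"))
```

Note that count excludes NaN values, so it can be smaller than the number of rows.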

How many people registered for the meeting (this is every unique email)?
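One possible way to count unique values in a column is Series.nunique(). The sketch below uses invented emails, not the real data.

```python
import pandas as pd

# Invented example data: abc123 appears twice (for example, it rejoined).
df = pd.DataFrame({"Email": ["abc123", "def456", "abc123", "ghi789"]})

# nunique() counts distinct values, so repeated rows are counted once.
n_registered = df["Email"].nunique()
print(n_registered)
```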

How many people attended the symposium (this is every unique email with a numeric Time in Session)?
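A sketch of one approach, assuming the non-numeric entries in Time in Session have already been read in as NaN: drop the rows with a missing time, then count the distinct emails that remain. The data here is invented.

```python
import numpy as np
import pandas as pd

# Invented example data; def456 registered but has no time in session.
df = pd.DataFrame(
    {
        "Email": ["abc123", "def456", "abc123", "ghi789"],
        "Time in Session": [42.0, np.nan, 15.0, 30.0],
    }
)

# Keep only rows where a numeric time was recorded.
attended = df.dropna(subset=["Time in Session"])

# Count each attendee once, even if they have several rows.
n_attended = attended["Email"].nunique()
print(n_attended)
```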

Make a list of the countries they were from, and count how many there are.
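The following sketch lists and counts distinct countries among attendees. The column name Country is a guess (check df.columns for the actual name in the data file), the rows are invented, and it reads "they" as the people who attended.

```python
import numpy as np
import pandas as pd

# Invented example data; "Country" is a hypothetical column name.
df = pd.DataFrame(
    {
        "Email": ["abc123", "def456", "abc123", "ghi789"],
        "Time in Session": [42.0, np.nan, 15.0, 30.0],
        "Country": ["United States", "Germany", "United States", "India"],
    }
)

# Restrict to rows with a recorded time, i.e. people who attended.
attended = df.dropna(subset=["Time in Session"])

# unique() returns each country once; dropna() skips missing entries.
countries = sorted(attended["Country"].dropna().unique())
n_countries = len(countries)
print(countries, n_countries)
```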

Plot a distribution (histogram) of the total time in session for each user. Also compute the average time spent in the session.
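Because one user can appear in several rows, a per-user total needs a groupby before plotting. The sketch below shows that pattern on invented data. The bin count and axis labels are arbitrary choices, and the time unit is not stated in the assignment, so none is shown.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Invented example data; abc123 has two sessions that should be summed.
df = pd.DataFrame(
    {
        "Email": ["abc123", "def456", "abc123", "ghi789"],
        "Time in Session": [42.0, np.nan, 15.0, 30.0],
    }
)

# Drop rows without a time first; otherwise sum() would report 0 for them.
attended = df.dropna(subset=["Time in Session"])

# Total time per user: one value per unique email.
total_time = attended.groupby("Email")["Time in Session"].sum()

# Histogram of the per-user totals.
fig, ax = plt.subplots()
ax.hist(total_time, bins=5)
ax.set_xlabel("Total time in session")
ax.set_ylabel("Number of users")

# Average of the per-user totals.
mean_time = total_time.mean()
print(mean_time)
```

In a notebook the figure is displayed when the cell finishes; in a plain script, call plt.show() at the end.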

When you are done, download a PDF and turn it in on Canvas. Make sure to save your notebook, then run this cell and click on the download link.

%run ~/s24-06642/s24.py
%pdf