17 Data analysis and visualization (3)
Resources |
Revision Questions |
Information Technology IT
Login to see all questions
Click on a question to view the answer
1.
Question 1
A retail company has collected sales data for the last year. The data includes information on product category, region, and sales revenue. You need to create a data visualization to identify the top 5 performing product categories by total sales revenue. Describe the steps you would take using a spreadsheet software (e.g., Microsoft Excel, Google Sheets) to achieve this, including the specific functions or features you would use. Include a description of the resulting chart type and why it is appropriate for this task.
To identify the top 5 performing product categories, I would follow these steps:
- Data Preparation: Ensure the data is in a tabular format with columns for 'Product Category' and 'Sales Revenue'.
- Pivot Table Creation: Select the entire dataset and create a pivot table. Drag 'Product Category' to the 'Rows' area and 'Sales Revenue' to the 'Values' area. Ensure 'Sales Revenue' is set to 'Sum'.
- Sorting: Sort the pivot table in descending order based on the 'Sum of Sales Revenue' column.
- Filtering: Apply a filter to the pivot table to display only the top 5 rows.
- Chart Creation: Select the data from the pivot table (specifically the 'Product Category' and 'Sum of Sales Revenue' columns). Create a bar chart. The 'Product Category' will be on the X-axis and 'Sum of Sales Revenue' will be on the Y-axis.
Chart Type Justification: A bar chart is appropriate because it allows for a clear visual comparison of the sales revenue across different product categories. The height of each bar directly represents the total sales revenue, making it easy to identify the top performers.
2.
Question 2
A university wants to analyze student enrollment data. The data includes information on student year of study (e.g., First Year, Second Year, Third Year), department, and number of students enrolled. Design a pivot table and chart solution to answer the following questions:
- What is the total number of students enrolled in each department?
- What is the distribution of students across different years of study within each department?
Clearly specify the pivot table layout and the type of chart you would use for each question. Provide a brief explanation of why your chosen chart type is suitable.
Question 1: Total Students per Department
Pivot Table Layout:
- Rows: Department
- Values: Count of Students (ensure the 'Count of Students' field is selected)
Chart Type: A bar chart is suitable. The department names will be on the X-axis and the number of students will be on the Y-axis. This allows for a direct comparison of enrollment numbers between departments.
Question 2: Distribution of Students by Year within Department
Pivot Table Layout:
- Rows: Department
- Columns: Year of Study
- Values: Count of Students
Chart Type: A stacked bar chart is appropriate. Each bar will represent a department, and the segments within each bar will represent the number of students in each year of study. This visually shows the proportion of students in each year within each department.
3.
Question 3
You are given a dataset containing website traffic data. The data includes the date, source of the traffic (e.g., Google, Facebook, Direct), and the number of page views. You need to create a data visualization to identify the trend in website traffic over time and compare the traffic from different sources. Describe how you would use a pivot table and chart to achieve this. Specify the pivot table layout and the chart type you would use, explaining your rationale.
To analyze website traffic trends and source comparisons, I would:
- Pivot Table Layout:
- Rows: Date (or a date hierarchy - Year, Month, Day)
- Columns: Source
- Values: Sum of Page Views
- Chart Type: A line chart is the most suitable for visualizing website traffic trends over time. The date (or date hierarchy) will be on the X-axis, and the 'Sum of Page Views' will be on the Y-axis. Separate lines will be plotted for each traffic source (Google, Facebook, Direct).
Rationale: A line chart effectively displays changes in traffic over time, allowing for easy identification of peaks, troughs, and overall trends. The separate lines for each source enable a direct comparison of traffic performance from different channels. This helps determine which sources are most effective at driving website traffic.