Posts

Showing posts from October, 2025

Module # 9 assignment

For this assignment, I used the mtcars dataset, which contains data on car models and their performance characteristics, such as miles per gallon (mpg), horsepower (hp), weight (wt), and number of cylinders (cyl). I chose this dataset because it offers a mix of continuous variables and discrete ones that can be treated as categorical, which makes it well suited to showing relationships across multiple dimensions. I created a scatter plot using ggplot2 in R, with horsepower (hp) on the x-axis and miles per gallon (mpg) on the y-axis. To incorporate additional variables, I mapped color to the number of cylinders (cyl), mapped point size to car weight (wt), and faceted the plot by the number of gears (gear). This design lets viewers quickly see how fuel efficiency decreases as horsepower and weight increase, and how that relationship varies across cars with different numbers of gears and cylinders. This multivariate visualization effectively highlights the trade-o...
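The plot described in this excerpt could look roughly like the following. This is a minimal sketch, assuming ggplot2 is installed; the object name `p`, the title, the axis and legend labels, the transparency setting, and the theme are illustrative choices and are not taken from the original post.

```r
library(ggplot2)

# Hypothetical reconstruction of the described plot.
# Labels, alpha, and theme are assumptions made for illustration.
p <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  # Color encodes cylinder count (as a factor); point size encodes weight.
  geom_point(aes(color = factor(cyl), size = wt), alpha = 0.8) +
  # One panel per number of gears, with panel titles such as "gear: 4".
  facet_wrap(~ gear, labeller = label_both) +
  labs(title = "MPG vs. Horsepower by Cylinders, Weight, and Gears",
       x = "Horsepower (hp)",
       y = "Miles per Gallon (mpg)",
       color = "Cylinders",
       size = "Weight (1000 lbs)") +
  theme_minimal()

print(p)
```

Wrapping `cyl` in `factor()` gives a discrete color legend rather than a continuous gradient, which usually reads better for a variable with only three distinct values.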

Module # 8 Correlation Analysis and ggplot2

# Step 1: Load Libraries
library(ggplot2)
library(gridExtra)

# Step 2: Load Dataset
data(mtcars)
head(mtcars)

# Step 3: Explore Relationships (Correlation)
# Compute correlations between MPG and other numeric variables
cor_matrix <- cor(mtcars)
print(cor_matrix)

# Look specifically at correlation between mpg and wt
cor_mpg_wt <- cor(mtcars$mpg, mtcars$wt)
cat("Correlation between MPG and Weight:", cor_mpg_wt, "\n")

# Step 4: Build a Linear Regression Model
model <- lm(mpg ~ wt, data = mtcars)
summary(model)

# Step 5: Visualizations
# A. Simple regression plot for MPG vs Weight
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "steelblue", size = 3) +
  stat_smooth(method = "lm", se = TRUE, color = "darkred", linewidth = 1) +
  labs(title = "Relationship Between Weight and MPG",
       x = "Weight (1000 lbs)",
       y = "Miles per Gallon (MPG)") +
  theme_minimal(base_size = 14...

Module #7 Assignment: Visualizing Distributions in R

# Load required libraries
# -------------------------------
library(ggplot2)

# -------------------------------
# Step 1: Load and Inspect Dataset
# -------------------------------
data("mtcars")

# View first few rows
head(mtcars)

# Check structure
str(mtcars)

# -------------------------------
# Step 2: Histogram of MPG
# -------------------------------
hist_mpg <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(binwidth = 2, fill = "steelblue", color = "black") +
  labs(title = "Distribution of Miles per Gallon (mpg)",
       x = "Miles per Gallon",
       y = "Count") +
  theme_minimal()

# Display plot
print(hist_mpg)

# Save plot as image
ggsave("hist_mpg.png", plot = hist_mpg, width = 6, height = 4, dpi = 300)

# -------------------------------
# Step 3: Density Plot of Horsepower by Cylinder
# -------------------------------
density_hp <- ggplot(mtcars, aes(x = hp, fill = factor(cyl))) +
  geom_density(alpha...

Module 6 assignment

For this assignment, I used the built-in mtcars dataset in RStudio to explore how fuel efficiency differs with the number of engine cylinders. I created a boxplot comparing the distribution of mpg values across cylinder categories (4, 6, and 8 cylinders). The boxplot clearly reveals differences in fuel efficiency between the groups: cars with 4 cylinders show the highest mpg, while 8-cylinder cars show the lowest. This pattern illustrates the principle of spotting differences, as the distinct box heights and median lines make the variation across groups easy to see. From a deviation analysis perspective, the visualization shows how cars with 6 and 8 cylinders fall below the overall mean mpg of the dataset. This aligns well with the ideas from the class material about clarity and minimalism in displaying differences. One challenge I faced was adjusting the boxplot labels and colors to make the visu...
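The comparison described in this excerpt can be sketched as below. This is only an illustration, assuming ggplot2 is installed; the names `cars_df`, `overall_mean`, `group_means`, and `box_mpg`, the dashed reference line, and all labels and styling are assumptions rather than the code from the original post.

```r
library(ggplot2)

# Treat cylinder count as a categorical grouping variable.
cars_df <- transform(mtcars, cyl = factor(cyl))

# Overall mean mpg, used as the reference point for deviation.
overall_mean <- mean(cars_df$mpg)

# Mean mpg per cylinder group and its deviation from the overall mean.
group_means <- aggregate(mpg ~ cyl, data = cars_df, FUN = mean)
group_means$deviation <- group_means$mpg - overall_mean
print(group_means)

# Boxplot of mpg by cylinder count, with a dashed line at the overall mean
# (the reference line and styling are illustrative assumptions).
box_mpg <- ggplot(cars_df, aes(x = cyl, y = mpg, fill = cyl)) +
  geom_boxplot() +
  geom_hline(yintercept = overall_mean, linetype = "dashed") +
  labs(title = "Fuel Efficiency by Number of Cylinders",
       x = "Cylinders",
       y = "Miles per Gallon (mpg)") +
  theme_minimal() +
  theme(legend.position = "none")

print(box_mpg)
```

Printing `group_means` alongside the plot makes the deviation claim checkable with numbers: a negative value in the `deviation` column means that group's mean mpg lies below the overall mean.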