Final Project LIS4317

LIS4317 Final Project – Diamonds Dataset Analysis

 Kyla Garcia
Platform: Tableau
Dataset: Diamonds (bundled with R's ggplot2 package; 53,940 observations, 10 variables)
Date: November 26, 2025


 

1. Introduction

Diamond prices vary according to the Four Cs: Cut, Clarity, Color, and Carat. This project examines how each factor affects price and which factors are the strongest predictors.
Research Question: To what extent do diamond characteristics predict and influence diamond price?

 

2. Dataset Overview

Variable    Description
Carat       Weight of the diamond, in carats
Cut         Cut quality (Fair, Good, Very Good, Premium, Ideal)
Color       Color grade, D–J (colorless to near-colorless)
Clarity     Clarity grade, I1 → IF (low to high clarity)
Depth       Total depth %
Table       Width of the top of the diamond relative to its widest point (%)
Price       Price in USD
x, y, z     Dimensions in mm (length, width, depth)

Note: Dataset has 53,940 rows and 10 columns.
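
As a small illustration of how Depth relates to x, y, and z: the ggplot2 documentation for this dataset defines total depth percentage as 2 * z / (x + y). The sketch below applies that formula in Python. The function name and the sample measurements are made up for illustration and are not rows from the dataset.

```python
def total_depth_pct(x_mm: float, y_mm: float, z_mm: float) -> float:
    """Total depth % = z / mean(x, y) = 2 * z / (x + y), as a percentage."""
    if x_mm + y_mm <= 0:
        raise ValueError("x and y must be positive lengths in mm")
    return 100.0 * 2.0 * z_mm / (x_mm + y_mm)

# Hypothetical stone measuring 4.00 mm x 4.00 mm x 2.48 mm
example_depth = total_depth_pct(4.00, 4.00, 2.48)
print(round(example_depth, 1))
```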

 

3. Methodology

Import diamonds.csv into Tableau.

Create five visualizations:

Price Distribution (Histogram)

Price vs Cut (Box Plot)

Price vs Clarity (Box Plot / Dot Plot)

Price vs Color (Box Plot / Dot Plot)

Carat vs Price (Scatter Plot with Trend Line)

Interpret medians, sums, and trends.

Build a dashboard summarizing insights.
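
The interpretation step can also be cross-checked outside Tableau. The sketch below is a minimal, standard-library Python version of a per-category summary (count, median, and sum of price). The helper name `summarize_by` and the six sample rows are hypothetical, chosen only to show the mechanics, and are not taken from the dataset.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample rows (not real dataset values)
sample = [
    {"cut": "Ideal", "price": 900},
    {"cut": "Ideal", "price": 1100},
    {"cut": "Ideal", "price": 1500},
    {"cut": "Premium", "price": 2000},
    {"cut": "Premium", "price": 3000},
    {"cut": "Fair", "price": 3200},
]

def summarize_by(rows, key, value="price"):
    """Return {category: {"count", "median", "sum"}} for one categorical column."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        cat: {"count": len(vals), "median": median(vals), "sum": sum(vals)}
        for cat, vals in groups.items()
    }

summary = summarize_by(sample, "cut")
for cut, stats in summary.items():
    print(cut, stats)
```

With the real file, the rows could instead come from `csv.DictReader` over diamonds.csv, converting the price field to a number before summarizing.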

 

4. Visualizations & Analysis

Chart 1: Price Distribution

Observation: The distribution is right-skewed; most diamonds are priced between $500 and $2,500, with a few expensive outliers.
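
One quick numeric check for right skew is that the mean sits above the median, because a few high values pull the mean up. The sketch below shows this on a hypothetical list of prices (not dataset values) and bins them the way a histogram would. The bin width of 1,000 is an arbitrary choice.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical prices in USD: many low values, a few large outliers
prices = [500, 700, 900, 1200, 1500, 1800, 2400, 9000, 15000]

avg = mean(prices)
mid = median(prices)
is_right_skewed = avg > mid  # heuristic only, not a formal skewness test

# Histogram counts: each key is the lower edge of a 1,000-wide bin
bin_width = 1000
bins = Counter((p // bin_width) * bin_width for p in prices)

print(f"mean={avg:.0f}, median={mid}, right-skewed={is_right_skewed}")
for lower in sorted(bins):
    print(f"{lower:>6}-{lower + bin_width - 1:<6} {'#' * bins[lower]}")
```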

 

Chart 2: Price vs Cut

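Observation: Ideal cut ranks highest, followed by Premium and Very Good. This ordering matches total (summed) price, which grows with the number of diamonds in each cut; in this dataset the per-stone median price of Ideal diamonds is actually the lowest of the five cuts.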
Price vs Cut (Box Plot)

 

Chart 3: Price vs Clarity

Observation: SI1 clarity has the highest total (summed) price. Higher clarity grades such as IF and VVS are less frequent, which lowers their aggregate totals.
Price vs Clarity (Box Plot)

Price vs clarity showing market-preferred grades dominate total revenue.
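
Because a summed total equals the number of stones times their average price, a common grade can lead on total revenue regardless of how it compares per stone. The sketch below illustrates that decomposition for two grades. All prices and counts are invented for illustration and do not reflect the dataset's actual per-grade prices.

```python
from statistics import mean

# Hypothetical prices per clarity grade (illustrative only)
prices_by_grade = {
    "SI1": [3000, 3500, 4000, 4500, 5000, 4000],  # common grade in this toy sample
    "IF": [6000, 8000],                            # rare grade in this toy sample
}

def total_and_parts(prices):
    """Return (total, count, average); total equals count * average."""
    return sum(prices), len(prices), mean(prices)

results = {grade: total_and_parts(p) for grade, p in prices_by_grade.items()}
for grade, (total, count, avg) in results.items():
    print(f"{grade}: total={total}, count={count}, average={avg:.0f}")
```

Comparing the count and average columns side by side makes it clear whether a high total comes from popularity or from per-stone price.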

 

Chart 4: Price vs Color

Observation: The G color grade has the highest total (summed) price, which reflects its popularity rather than a higher per-carat value.
Price vs Color (Box Plot)
Price vs color highlighting popular near-colorless grades.

 

Chart 5: Carat vs Price

Observation: Price rises steeply and non-linearly with carat, in a roughly exponential pattern; larger diamonds are rarer and priced disproportionately higher. The trend line confirms the non-linear relationship.
Carat vs Price (Scatter Plot with Trend Line)
Scatter plot showing price increases exponentially with carat.
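
An exponential trend of the form price = a * exp(b * carat) can be fitted by ordinary least squares on log(price). The sketch below does this with the standard library only. It is an assumption that this matches the trend-line model used in the Tableau chart, the function name `fit_exponential` is hypothetical, and the data points are synthetic, generated from known parameters so the fit can be checked.

```python
import math

def fit_exponential(xs, ys):
    """Least-squares fit of ln(y) = ln(a) + b * x; returns (a, b)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (lg - mean_log) for x, lg in zip(xs, logs))
    b = sxy / sxx
    a = math.exp(mean_log - b * mean_x)
    return a, b

# Synthetic data from price = 400 * exp(1.5 * carat) (not dataset values)
carats = [0.3, 0.5, 0.7, 1.0, 1.5, 2.0]
synthetic_prices = [400 * math.exp(1.5 * c) for c in carats]

a, b = fit_exponential(carats, synthetic_prices)
print(f"a={a:.1f}, b={b:.2f}")
```

On real data the recovered parameters would only approximate the relationship, and comparing this fit against a power-law fit (log of price against log of carat) would show which curve describes the scatter better.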

5. Dashboard Summary

Carat weight is the strongest predictor of price, followed by cut quality. Market-preferred clarity and color grades dominate total revenue. Together, these insights reveal both consumer preferences and pricing trends.

 

6. Conclusion

Carat weight has the strongest influence on price, and cut quality is the next most influential factor. In the aggregate price totals, mid-range clarity (SI1) and color (G) dominate revenue because those grades are the most common in the market. Overall, the analysis gives a clear picture of how diamond characteristics affect price.



  
