
April 10, 2025

Follow the instructions

Group Project

Mostafa Rezaei

Big Data (Introduction to Data Science)

General information

The group project gives you the opportunity to practice many of the skills we learned in class.

It includes 5 steps:

Step 1: Find and describe a data set

Find a publicly available data set. It should not come from the UCI ML Repository, and it should not be
a data set commonly used in ML competitions. Open Data Initiative websites are good places to find
data sets.

Once you find a dataset, you should figure out:

• the individual or organization that created it;
• the purpose of its creation;
• its terms of use;
• how the data was collected and the sampling procedure;
• the definitions of the variables and their units.

Step 2: Perform an initial EDA

Perform an initial EDA, where you create plots and group summaries to understand the variation of each
variable, including typical values, clusters, outliers, missing values, etc.
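A minimal sketch of what such a first pass might look like, assuming dplyr and ggplot2 are installed and R is version 4.1 or later (for the `|>` pipe). The `trips` data frame and its columns `borough` and `duration_min` are hypothetical stand-ins for your own data set, simulated here only so the example runs on its own.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for your own data set (simulated, not real data)
set.seed(1)
trips <- tibble(
  borough = sample(c("North", "South", "East"), 200, replace = TRUE),
  duration_min = c(rexp(195, rate = 1 / 15), rep(NA, 5))
)

# Count the missing values in every variable
missing_counts <- trips |>
  summarise(across(everything(), function(x) sum(is.na(x))))

# Group summary: typical value and spread of one variable per group
duration_by_borough <- trips |>
  group_by(borough) |>
  summarise(
    n = n(),
    median_duration = median(duration_min, na.rm = TRUE),
    iqr_duration = IQR(duration_min, na.rm = TRUE)
  )

# Distribution of a single variable, useful for spotting clusters and outliers
duration_hist <- ggplot(trips, aes(x = duration_min)) +
  geom_histogram(bins = 30, na.rm = TRUE)
```

Printing `missing_counts`, `duration_by_borough`, and `duration_hist` in the notebook shows the table and plot outputs; the same pattern can be repeated for each variable of interest.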

Step 3: Perform an in-depth EDA

In this step, you should perform an in-depth EDA in order to discover interesting covariations and patterns
in the data.

As discussed in class, you should go through an iterative cycle of asking questions about the data and
finding answers to those questions using data transformation and visualization. Investigate the answers you
obtain with curiosity and skepticism, and follow up with further (more detailed) questions.
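One round of that question-and-answer cycle could be sketched as follows. The `listings` data frame, its columns, and the numbers used to simulate it are all hypothetical; the point is only the pattern of a first question, a look at the answer, and a more detailed follow-up question.

```r
library(dplyr)
library(ggplot2)

# Hypothetical, simulated data so the example is self-contained
set.seed(2)
listings <- tibble(
  area_m2 = runif(300, 30, 150),
  district = sample(c("centre", "suburb"), 300, replace = TRUE),
  price = 1000 + 20 * area_m2 +
    if_else(district == "centre", 800, 0) +
    rnorm(300, sd = 200)
)

# Question 1: does price covary with area?
q1_plot <- ggplot(listings, aes(area_m2, price)) +
  geom_point(alpha = 0.5)

# Follow-up question: is the relationship the same in every district?
cor_by_district <- listings |>
  group_by(district) |>
  summarise(
    n = n(),
    cor_area_price = cor(area_m2, price),
    median_price = median(price)
  )

q2_plot <- ggplot(listings, aes(area_m2, price, colour = district)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

If the follow-up plot shows two roughly parallel bands, a natural next question is what else differs between the districts, which leads to the next transformation and plot.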

Step 4: Build a prediction model

In this step, you should think about an interesting prediction problem using your dataset. Think about
the details of the prediction model, including:

• the response variable and predictor variables;
• the evaluation metrics;
• how you will conduct CV to estimate the out-of-sample performance of the model and to tune the
hyper-parameters of the model.
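The sketch below illustrates one possible way to organize this, using base R only. It is not the required method: the simulated data, the choice of polynomial degree as the hyper-parameter, RMSE as the metric, 5 folds, and a 20% held-out test set are all assumptions made for illustration. Your course may have used a modelling package instead.

```r
# Simulated data: response y, one predictor x (hypothetical)
set.seed(3)
n <- 200
dat <- data.frame(x = runif(n, -2, 2))
dat$y <- sin(2 * dat$x) + rnorm(n, sd = 0.3)

# Hold out a test set that is never used for tuning
test_idx <- sample(n, size = 40)
train <- dat[-test_idx, ]
test <- dat[test_idx, ]

# Evaluation metric: root mean squared error
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Assign every training row to one of k folds at random
k <- 5
fold_id <- sample(rep(seq_len(k), length.out = nrow(train)))

# Hyper-parameter grid: degree of the polynomial in x
degrees <- 1:6

# For each degree, average the validation RMSE over the k folds
cv_rmse <- sapply(degrees, function(d) {
  fold_rmse <- sapply(seq_len(k), function(f) {
    fit <- lm(y ~ poly(x, d), data = train[fold_id != f, ])
    rmse(train$y[fold_id == f], predict(fit, newdata = train[fold_id == f, ]))
  })
  mean(fold_rmse)
})

# Pick the degree with the lowest CV error, refit on all training data,
# and estimate out-of-sample performance on the untouched test set
best_degree <- degrees[which.min(cv_rmse)]
final_fit <- lm(y ~ poly(x, best_degree), data = train)
test_rmse <- rmse(test$y, predict(final_fit, newdata = test))
```

Keeping the test set out of the tuning loop matters: the CV error of the selected degree is optimistically biased because it was used to make the selection, so `test_rmse` is the more honest out-of-sample estimate.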

Step 5: Present your findings!

The value of your research is limited if you keep it to yourself. So in this step you will polish your most
interesting findings in a presentation for the world to see. See below for details.

Deliverables

1. Create two plots that visually represent your most notable findings from your EDA:

• Create the plots using ggplot2
• Ensure they are polished and self-contained, with meaningful titles, subtitles, labels, and captions
• Save each plot separately as a PNG or PDF file
• For preparing plots for communication, see
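As a rough illustration of a self-contained plot, the sketch below adds a title, subtitle, axis labels, legend title, and caption with `labs()` and saves the result with `ggsave()`. The `usage` data, all label text, and the file name `eda_plot_1.png` are hypothetical placeholders; the caption in a real plot should name your actual data source.

```r
library(ggplot2)

# Hypothetical, simulated data standing in for an EDA finding
set.seed(4)
usage <- data.frame(
  hour = rep(0:23, times = 2),
  day_type = rep(c("Weekday", "Weekend"), each = 24)
)
usage$rides <- round(
  100 + 60 * sin((usage$hour - 6) / 24 * 2 * pi) +
    ifelse(usage$day_type == "Weekend", -30, 0) +
    rnorm(48, sd = 8)
)

# Every text element a reader needs is set explicitly in labs()
p <- ggplot(usage, aes(hour, rides, colour = day_type)) +
  geom_line() +
  labs(
    title = "Ride counts peak around midday",
    subtitle = "Average rides per hour, by type of day",
    x = "Hour of day",
    y = "Number of rides",
    colour = "Type of day",
    caption = "Source: simulated data for illustration"
  ) +
  theme_minimal()

# Save the plot as its own file; use a .pdf extension for PDF output
out_file <- "eda_plot_1.png"
ggsave(out_file, plot = p, width = 8, height = 5, dpi = 300)
```

Stating the finding in the title, rather than just naming the variables, helps the plot stand on its own when it appears on a slide.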

2. Build a prediction model and calculate its out-of-sample performance

• Specify the response and predictor variables
• Specify the evaluation metrics used
• Specify how you perform CV to obtain an estimate of its out-of-sample performance and to tune
its hyper-parameters

3. Provide an R Notebook file (with the extension .Rmd) containing your code and code outputs, such as
plots and tables

• Output an HTML file from the notebook, ensuring it correctly displays all code and outputs
• Use minimal comments in your code and follow the Tidyverse style guide:

org/index.html
• On the first line of the R Notebook, briefly list the contributions of each team member

4. Record a 5-minute presentation of your work

• Create a 5-slide presentation
• Slide 1: Contextual information about the dataset
• Slide 2: A detailed description of the dataset
• Slides 3 and 4: The two plots showcasing your primary EDA findings
• Slide 5: Information and results of your trained prediction model
• Your recorded presentation should not exceed 5 minutes

Submission details

• Upload your work on the dedicated assignment for the group project on BB
• Only one person per group needs to submit their group’s work
• Compress all your files into a ZIP file, containing

– the 2 PNG or PDF files of the EDA plots
– Your R Notebook, with an extension .Rmd
– The HTML file created from your R Notebook
– Your presentation file in PDF format
– The recording of your presentation
