
Data Science Curriculum Plan

| Week | Day | Topics |
| --- | --- | --- |
| Week 1 | Day 1-2 | Introduction to Data Science: Prelude, The Problem Landscape, Defining Data Science, Demystifying Data Science, Overview of Data Scientist’s Toolbox |
| | Day 3-4 | Python Basics: Installation, Setup, Data Types, Functions, Data Manipulation & Visualization with Matplotlib and Seaborn |
| | Day 5-6 | Probability & Statistics: Descriptive Statistics, Probability Distributions (Gaussian, Bernoulli, Binomial) |
| | Day 7-8 | Holiday |
| Week 2 | Day 9-10 | Probability & Statistics: Dispersion (Variance, Standard Deviation), Hypothesis Testing (Z-test, t-test, Chi-square), Correlation Analysis (Pearson, Spearman) |
| | Day 11-12 | Introduction to NumPy: Array Operations, Random Data Generation |
| | Day 13-14 | Pandas: Importing Datasets, Data Wrangling, Cleaning, EDA |
| | Day 15-16 | Holiday |
| | Day 14 | QA Session |
| Week 3 | Day 17-18 | SQL for Data Science: Basic Queries, Joins, Subqueries, Aggregation |
| | Day 19-20 | SQL: Group By, HAVING, Aggregation Functions, Working with SQL Databases |
| | Day 21-22 | SciPy & Seaborn: Advanced Visualization, EDA |
| | Day 23-24 | Holiday |
| Week 4 | Day 25-26 | Principles of Information Visualization, Applied Visualizations (Matplotlib, Seaborn) |
| | Day 27-28 | Tableau Basics: Loading Data, Creating Charts, Basic Visual Analysis |
| | Day 29-30 | EDA & Hypothesis Testing: Feature Engineering, Statistical Inference |
| | Day 31-32 | Holiday |
| | Day 30 | QA Session |
| Week 5 | Day 33-34 | Machine Learning Introduction: Supervised vs Unsupervised Learning, Overview of Clustering, Classification, and Regression |
| | Day 35-36 | Linear Regression: Best Fit Line, Model Training, Model Evaluation |
| | Day 37-38 | Logistic Regression: Sigmoid Curve, Model Evaluation |
| | Day 39-40 | Holiday |
| Week 6 | Day 41-42 | Support Vector Machine (SVM): Kernel Trick, Hyperplanes, Model Evaluation |
| | Day 43-44 | K-Nearest Neighbors (KNN): KNN Algorithm, Distance Metrics, Model Evaluation |
| | Day 45-46 | AutoML for Model Building: Introduction, Tools, Automated Model Optimization |
| | Day 47-48 | Holiday |
| | Day 42 | QA Session |
| Week 7 | Day 49-50 | Unsupervised Learning: Clustering, K-Means Algorithm, Model Evaluation |
| | Day 51-52 | Principal Component Analysis (PCA): Dimensionality Reduction, Eigenvectors, Eigenvalues |
| | Day 53-54 | Text Mining with NLTK: Text Preprocessing, Regex, Text Classification |
| | Day 55-56 | Holiday |
| Week 8 | Day 57-58 | Machine Learning Web App Development with Streamlit: Building Interactive ML Apps, Deploying Models |
| | Day 59-60 | FastAPI for ML Deployment: API Building, Asynchronous Processing, Deployment & Scaling |


Assignment Plan

| Week | Assignment | Description | Due Date | Skills Covered |
| --- | --- | --- | --- | --- |
| Week 1 | Assignment 1: Data Science Fundamentals | Write a report on the role of a Data Scientist, including a comparison between Data Science, AI, ML, and DL. Complete a basic Python exercise: data types, loops, functions, and basic file handling. | Day 6 | Data Science Concepts; Python Basics (Variables, Functions, Data Types) |
| Week 2 | Assignment 2: Probability & Statistics | Implement basic statistical tests (z-test, t-test, Chi-square) using Python. Conduct a descriptive analysis (mean, median, standard deviation) on a dataset. | Day 14 | Hypothesis Testing; Descriptive Statistics; Data Wrangling with Python |
| Week 3 | Assignment 3: SQL Basics | Create a set of SQL queries to filter, sort, and join data from multiple tables. Perform aggregation on a dataset using SQL. | Day 22 | SQL Queries; Data Aggregation and Joins |
| Week 4 | Assignment 4: Data Visualization & Tableau | Create a report on information visualization principles using Matplotlib/Seaborn. Create a dashboard in Tableau using real-world data (from Excel). | Day 30 | Data Visualization (Python, Tableau); Data Analysis & Reporting |
| Week 5 | Assignment 5: Regression Analysis | Implement Linear Regression in Python to predict an outcome based on features. Evaluate the model's performance (e.g., R-squared, Mean Absolute Error). | Day 38 | Linear Regression; Model Evaluation Techniques |
| Week 6 | Assignment 6: Classification Models | Implement Logistic Regression and KNN (K-Nearest Neighbors) for a classification problem. Compare models’ performance (e.g., accuracy, precision, recall). | Day 46 | Logistic Regression; KNN Algorithm; Classification Metrics |
| Week 7 | Assignment 7: Unsupervised Learning | Implement K-Means clustering on a dataset. Use PCA (Principal Component Analysis) for dimensionality reduction and explain the results. | Day 54 | K-Means Clustering; PCA (Dimensionality Reduction) |
| Week 8 | Assignment 8: Text Mining & Web App | Implement a text classification task using NLTK or Regex. Build a basic ML web app using Streamlit, where users can input data for model predictions. | Day 60 | Text Mining (NLTK, Regex); ML Web App Development (Streamlit) |

Assignments:

Week 1 Assignment - Data Science Fundamentals

  • Task: Report on Data Science vs. AI, ML, DL.

    • Write a comparison paper highlighting key differences.

  • Python Exercise: A simple program to manipulate data and implement basic Python concepts.

    • Example: Build a small Python script that reads a file, processes some data, and outputs a summary of statistics.
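
One possible shape for the Python exercise is sketched below, using only the standard library. The file name, the column names, and the sample rows are made up for illustration; a real submission would read whatever file the assignment provides.

```python
import csv
import statistics
import tempfile
from pathlib import Path


def summarize_column(path, column):
    """Read a CSV file and return summary statistics for one numeric column."""
    with open(path, newline="") as handle:
        values = [float(row[column]) for row in csv.DictReader(handle)]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


# Write a tiny sample file (hypothetical data) so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample_scores.csv"
    sample_path.write_text("name,score\nAsha,82\nBen,91\nChitra,77\nDev,90\n")
    summary = summarize_column(sample_path, "score")

for key, value in summary.items():
    print(f"{key}: {value}")
```

The function covers the concepts listed for Week 1 (data types, loops via a comprehension, functions, file handling) in a few lines; error handling for missing columns or non-numeric cells is left out to keep it short.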

Week 2 Assignment - Probability & Statistics

  • Task: Implement hypothesis testing using Python:

    • Use libraries like SciPy to implement z-tests, t-tests, and Chi-square tests.

  • Descriptive Statistics: Work with datasets to calculate mean, median, mode, and standard deviation.

    • Example: Download a dataset and analyze central tendency, dispersion, and test hypotheses.
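
The sketch below shows one way the three tests and the descriptive statistics could be run with NumPy and SciPy. The data is randomly generated rather than downloaded, and the group means, the known population standard deviation used in the z-test, and the contingency table are all assumptions chosen for illustration. SciPy has no dedicated z-test function, so the z statistic is computed by hand from the normal distribution.

```python
import numpy as np
from scipy import stats

# Synthetic samples standing in for a real dataset (assumed values).
rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=50.0, scale=5.0, size=40)
group_b = rng.normal(loc=53.0, scale=5.0, size=40)

# Descriptive statistics: central tendency and dispersion.
print("mean:", np.mean(group_a))
print("median:", np.median(group_a))
print("sample std:", np.std(group_a, ddof=1))


def one_sample_z_test(sample, mu0, sigma):
    """Two-sided z-test of H0: mean == mu0, with known population sigma."""
    z = (np.mean(sample) - mu0) / (sigma / np.sqrt(len(sample)))
    p_value = 2 * stats.norm.sf(abs(z))
    return z, p_value


z_stat, z_p = one_sample_z_test(group_a, mu0=50.0, sigma=5.0)

# Two-sample t-test (Welch's version, which does not assume equal variances).
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=False)

# Chi-square test of independence on a 2x2 contingency table (made-up counts).
observed = np.array([[30, 10], [20, 20]])
chi2_stat, chi2_p, dof, expected = stats.chi2_contingency(observed)

print("z-test:", z_stat, z_p)
print("t-test:", t_stat, t_p)
print("chi-square:", chi2_stat, chi2_p, "dof:", dof)
```

In each case a small p-value (commonly below 0.05) is read as evidence against the null hypothesis; the report should state the hypotheses and the chosen significance level explicitly.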

Week 3 Assignment - SQL Basics

  • Task: Write SQL queries to perform different tasks:

    • Task 1: Filter, sort, and join datasets from multiple tables.

    • Task 2: Use SQL aggregation functions like SUM, AVG, COUNT to analyze data.
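
Both tasks can be tried without a database server by running the SQL through Python's built-in sqlite3 module. The table names, columns, and rows below are hypothetical; the queries themselves are standard SQL and should carry over to other databases with little change.

```python
import sqlite3

# In-memory database with two small, made-up tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha', 'Pune'), (2, 'Ben', 'Delhi'), (3, 'Chitra', 'Pune');
    INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 200.0), (4, 3, 50.0);
    """
)

# Task 1: join the two tables, filter by amount, and sort the result.
joined = conn.execute(
    """
    SELECT c.name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    WHERE o.amount >= 80
    ORDER BY o.amount DESC
    """
).fetchall()

# Task 2: aggregate per customer with COUNT, SUM, and AVG.
totals = conn.execute(
    """
    SELECT c.name, COUNT(o.id), SUM(o.amount), AVG(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

print(joined)
print(totals)
conn.close()
```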

Week 4 Assignment - Data Visualization & Tableau

  • Task 1: Write a report on data visualization principles (chart types, color usage, etc.).

  • Task 2: Create a Tableau dashboard that displays key insights from a dataset (e.g., sales data, customer data, etc.).

Week 5 Assignment - Regression Analysis

  • Task: Implement Linear Regression:

    • Train and evaluate the model using Python (using scikit-learn).

    • Evaluate the model using metrics such as R-squared and Mean Absolute Error.
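
A minimal sketch of the train-and-evaluate loop with scikit-learn follows. The data is synthetic (a straight line with slope 3 and intercept 5 plus noise, chosen only for illustration), so the fitted coefficients can be compared against known values; a real submission would load its own dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: y = 3x + 5 plus Gaussian noise (assumed for this sketch).
rng = np.random.default_rng(seed=42)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 1.0, size=200)

# Hold out a test set so the evaluation uses data the model has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

r2 = r2_score(y_test, predictions)
mae = mean_absolute_error(y_test, predictions)

print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("R-squared:", r2)
print("Mean Absolute Error:", mae)
```

R-squared reports the share of variance in the target explained by the model, while Mean Absolute Error reports the typical prediction error in the target's own units, so the two are worth quoting together.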

Week 6 Assignment - Classification Models

  • Task 1: Build a Logistic Regression model for binary classification (e.g., predict if a customer will buy a product based on features).

  • Task 2: Build a K-Nearest Neighbors (KNN) model and evaluate its performance.
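
The two tasks can share one evaluation loop, which makes the comparison in the assignment plan straightforward. The sketch below uses scikit-learn with a synthetic binary dataset from make_classification standing in for real customer data; the dataset parameters and the choice of k = 5 are assumptions. Features are standardized because KNN relies on distances.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (assumed; replace with a real dataset).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, n_redundant=0,
    class_sep=1.5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Fit each model and collect the same three metrics for a side-by-side view.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predicted),
        "precision": precision_score(y_test, predicted),
        "recall": recall_score(y_test, predicted),
    }

for name, metrics in results.items():
    print(name, metrics)
```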

Week 7 Assignment - Unsupervised Learning

  • Task 1: Implement K-Means clustering on a dataset to identify clusters.

    • Example: Use a customer dataset to segment customers into groups.

  • Task 2: Use PCA to reduce the dimensionality of the dataset and visualize the results.
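
Both tasks are sketched below with scikit-learn. The data comes from make_blobs with three hand-picked centers in five dimensions, standing in for a customer dataset; the number of clusters and the centers are assumptions, and with real data the cluster count usually has to be chosen (for example with the elbow method or silhouette scores).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

# Synthetic 5-dimensional data with three well-separated groups (assumed).
centers = np.array([
    [0, 0, 0, 0, 0],
    [6, 6, 6, 6, 6],
    [-6, 6, -6, 6, -6],
])
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=1.0, random_state=0)

# Task 1: K-Means clustering.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
score = silhouette_score(X, labels)

# Task 2: PCA down to two components for plotting.
pca = PCA(n_components=2)
reduced = pca.fit_transform(X)

print("silhouette score:", score)
print("explained variance ratio:", pca.explained_variance_ratio_)
print("reduced shape:", reduced.shape)
```

The two columns of reduced can then be drawn as a scatter plot colored by labels (for example with Matplotlib or Seaborn) to visualize the clusters, and the explained variance ratio indicates how much information the two components retain.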

Week 8 Assignment - Text Mining & Web App

  • Task 1: Implement text mining techniques using NLTK or Regex:

    • Perform text cleaning, tokenization, and classification tasks.

  • Task 2: Build a Streamlit web app that takes user input (like text or numbers), runs the model, and displays predictions.
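
A small sketch of the text-mining half is shown below, taking the Regex route so that no NLTK data downloads are needed. The cleaning rules, the six training sentences, and the choice of a bag-of-words Naive Bayes classifier from scikit-learn are all assumptions made for illustration; the assignment does not prescribe a particular model.

```python
import re

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def tokenize(text):
    """Lowercase the text, drop URLs, and keep alphabetic word tokens."""
    text = re.sub(r"https?://\S+", " ", text.lower())
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text)


def clean(text):
    """Return the cleaned text as a single space-separated string."""
    return " ".join(tokenize(text))


# Tiny made-up training set: 1 = positive review, 0 = negative review.
train_texts = [
    "I loved this product, it works great!",
    "Great quality and fast delivery",
    "Absolutely loved it, highly recommend",
    "Terrible quality, it broke after one day",
    "Very bad experience, do not recommend",
    "Awful product and slow delivery",
]
train_labels = [1, 1, 1, 0, 0, 0]

classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit([clean(text) for text in train_texts], train_labels)


def predict(text):
    """Clean one raw input string and return the predicted label."""
    return int(classifier.predict([clean(text)])[0])


print(tokenize("Visit http://example.com NOW!!! It's 100% great."))
print(predict("Great product, loved it"))
print(predict("Awful and terrible"))
```

A function like predict is the natural piece for the web app to call: the Streamlit script would collect the user's text from an input widget, pass it to the function, and display the returned label.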

Key Considerations for Assignments:

  • Submission Format: Most assignments will be submitted as reports (PDF or Jupyter Notebooks) or code (Python scripts).

  • Evaluation Criteria:

    • Correctness and efficiency of the code.

    • Clarity of explanations and reports.

    • Data analysis skills (EDA, feature selection, visualization).

    • Proper evaluation of machine learning models (accuracy, precision, recall, etc.).

  • Extensions: Some assignments will have bonus tasks for more advanced challenges, such as working with larger datasets or using more advanced techniques like hyperparameter tuning.
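
As an illustration of the hyperparameter tuning mentioned under Extensions, the sketch below runs a cross-validated grid search over a KNN classifier with scikit-learn. The synthetic dataset, the parameter grid, and the use of accuracy as the selection metric are assumptions; a bonus task may specify different ones.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for an assignment dataset (assumed).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_redundant=0, random_state=1
)

pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Candidate settings to compare; the step name prefixes each parameter.
param_grid = {
    "knn__n_neighbors": [3, 5, 7, 9],
    "knn__weights": ["uniform", "distance"],
}

# 5-fold cross-validation over every combination in the grid.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated accuracy:", search.best_score_)
```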
