Data Science

Data Science Course Contents

  • Introduction
    1. What is Analytics
    2. Why Analytics
    3. Areas of Applications
    4. Use cases in different industries
    5. Breaking a business problem/situation into numbers
    6. Digitization of the world (Concept of Big data/All data)
    7. Impact of Analytics
    8. Understanding the sources of data
    9. Is there any linkage between the sources?
    10. Solving an Analytics problem – introduction to CRISP-DM
    11. Understanding BU (Business Understanding), DU (Data Understanding), DP (Data Preparation), A&M (Analysis and Modeling), R&E (Results and Evaluation)
  • Data Pre-processing
    1. Big Data/Types of data and sources
    2. Data Analysis
    3. Why pre-process
    4. Handling Outliers
    5. Handling Noise
    6. Handling Missing values
    7. Data smoothing
    8. Data standardization
    9. Data visualization
  • R Programming
    1. Introduction
    2. Why use R programming
    3. Installing R and RStudio
    4. The packages paradigm
    5. Data types and structures
    6. Common data operations
    7. Functions in R
    8. Control functions
    9. plyr package: data cleaning, preparation and manipulation
    10. ggplot2 package: making data visualization easy
    11. reshape package: converting wide data to long and vice versa
  • Model Building
    1. Why build a model
    2. How to build a model
    3. Types of models (Supervised and Unsupervised)
    4. Parameters to consider for model building
    5. Select the best model
    6. Train a model
    7. Train and test data
  • Introduction to Statistics and Probability
    1. Basics of statistics and probability
    2. Data types
    3. Measures of central tendency
    4. Measures of dispersion
    5. Sampling
    6. Probability concepts with examples
  • Probability Distributions
    1. Introduction
    2. Bernoulli distribution
    3. Geometric distribution
    4. Binomial distribution
    5. Poisson distribution
    6. Normal distribution
    7. Central limit theorem
  • Advanced concepts in statistics
    1. Hypothesis testing
    2. Confidence intervals
    3. Inference
    4. T-statistics
    5. 1-sample t-test
    6. 2-sample t-test
    7. ANOVA
    8. Paired t-test
    9. Chi-square distribution
    10. Chi-square Goodness of fit
    11. Chi-square test of independence
  • Regression
    1. Why use a regression model
    2. The working principle behind a linear regression
    3. Simple linear Regression
    4. Model building process and assumptions in building regression model
    5. Interpretation of coefficients
    6. Model goodness of fit
    7. Dummy variables
    8. Residual analysis
    9. Outliers
    10. Multicollinearity
    11. Leverage and influence
    12. Extending to multiple linear regression
    13. Multiple linear regression
  • Logistic Regression
    1. Introduction to classification models – An example case study
    2. The logit transformation
    3. Applying to binary business decisions
    4. Model building process
    5. Data preparation
    6. Feature extraction
  • Time series analysis
    1. Time series vs. causal models
    2. Trend, Seasonality, Cyclicity
    3. Moving averages
    4. Exponential smoothing
    5. ARIMA
  • Distance Measures
  • Unsupervised Learning
    1. Clustering (Hierarchical, K-Means, KK-Means, K-Medoid, Spectral)
    2. Association Rule Mining
    3. Market Basket Analysis
  • Dimensionality Reduction Techniques
    1. Principal Component Analysis (PCA)
    2. Singular Value Decomposition (SVD)
  • Supervised Learning
    1. Decision Trees
    2. Neural Nets
    3. Support Vector Machines (SVM)
    4. Random Forest
    5. Ensembling Techniques (Bagging, Boosting, Stacking)
    6. Gradient Boosting Machines (GBM)
  • Text Analytics and Natural Language Processing (NLP)