Data Science SAS
Strong skills are needed to undertake an intensive and challenging course that is taught and examined with multiple tools. To help students gain the necessary tool skills, we provide Data Science course training with tools that help them improve their course proficiency. Students can find course details here, including 15+ hands-on case studies (see List of Case Studies). Altogether, the programme runs for 250+ hours and is taught by experts with 20 years of thorough experience in the field of data science.
Module 1
Foundations of Data Science: Data Visualization and Interpretation
Part-1: Referential Details for Data Science and Business Analytics
Scope & Facts of Data Science and Business Analytics
SWOT Analysis of Data Science and Business Analytics
Introduction to Advanced Data Analytics
Journey Mathematics-Statistics-Econometrics
Flow chart for Data Science and Business Analytics
Data warehouse conceptual discussions
Hadoop for Data Science
OLTP and OLAP for data information
Web Application report
Part-2: Descriptive Statistics:
Descriptive Statistics
Inferential Statistics
Types of Variables
Measures of central tendency
Data Variability and Dispersion
Five-Number Summary Analysis
Data Distribution Techniques
Exploration Techniques for Numerical data
Exploration techniques for Character Data
Visualization Exploration
Summary Exploration
Chebyshev’s Inequality
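As an illustration of the five-number summary topic listed above, here is a minimal sketch in Python using only the standard library. The sample values and the function name `five_number_summary` are made up for this example, and the "inclusive" quartile method is one of several common conventions, so other tools may report slightly different quartiles.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # method="inclusive" interpolates between the observed minimum and maximum;
    # this is an assumed convention, not the only valid one.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical sample used only for illustration.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]
summary = five_number_summary(sample)
print(summary)
```

The same five values are what a box plot draws, which is why the two topics are usually taught together.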
Part-3: Basic Probability for Business Issues:
Simple Probability
Marginal Probability
Joint Probability
Conditional probability (linked with Decision Tree Algorithms)
Bayes’ Theorem probability (linked with Naïve Bayes Algorithms)
Discrete Distributions
Binomial Distribution
Hypergeometric Distribution
Poisson Distribution
Continuous Distributions
Normal Distribution and Properties
Standardized Distributions
Part-4: Sampling Techniques for Big Data
Sampling Distributions
Simple Random
Systematic Sample
Stratified sample
Cluster Sample
Standard Error of the Mean
Skewness Std. Error
Kurtosis Std. Error
Central Limit Theorem
Sampling from Infinite Populations
Sampling Distributions for Mean
Sampling Distributions for proportions
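To make the standard error of the mean listed in this part concrete, the following is a small sketch assuming the usual definition, the sample standard deviation divided by the square root of the sample size. The data and the function name `standard_error_of_mean` are hypothetical and chosen only for illustration.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    n = len(sample)
    # statistics.stdev uses the n - 1 (sample) denominator.
    return statistics.stdev(sample) / math.sqrt(n)

# Hypothetical sample used only for illustration.
sample = [10, 12, 14, 16, 18]
sem = standard_error_of_mean(sample)
print(round(sem, 4))
```

Because the denominator grows with the square root of n, quadrupling the sample size roughly halves the standard error.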
Module 2
Data Preprocessing and Imputation
Part-5: Data Validation and Data Normality
Univariate normality techniques
Bivariate techniques
Multivariate techniques
Q-Q probability plots
Cumulative frequency
Exploratory analysis
Stem-and-leaf analysis
Histogram
Box plot
Scores for Normality Check
Kolmogorov-Smirnov test
Shapiro-Wilk test
Anderson-Darling test
Part-6: Data Cleaning Process and Quality Check
PCA for Big Data Analysis or Unsupervised data
PCA Regression Scores for Supervised Data
Noisy Data Detection
Data cleaning with Regression Residual
Data Scrubbing with statistical sense
Part-7: Data Imputation and outlier treatment
Outlier treatment with robust measurements
Outlier treatment with central tendency Mean
Outlier with Min Max Likelihood methods
Outlier Detection with Density Based
Visualize Outlier Treatment
Outlier with Residual Analysis
Outlier Detection with PCA Analysis
Data Imputation with series Central Tendency
Part-8: Test of Hypothesis
Null Hypothesis formulation
Alternative Hypothesis
Type I and Type II errors
Power Value
One tail and Two tail
One Sample T-TEST
Paired T-TEST
Independent Sample T-TEST
Analysis of Variance (ANOVA)
MANOVA
Chi Square Test
Kendall Chi Square
Kruskal-Wallis Rank Test Chi Square
Mann-Whitney, Chi Square
Wilcoxon, Chi Square
McNemar test Chi Square
Part-9: Data Transformation
Log transformation
Box-Cox transformation
Square root transformation
Inverse transformation
Min-Max data normalization
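The transformations listed in this part can be sketched in a few lines of Python. This is only an illustrative example: the input values and the helper name `min_max_normalize` are assumptions, and the Box-Cox transformation is left out because it needs an estimated lambda parameter.

```python
import math

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical positive values used only for illustration.
values = [1.0, 10.0, 100.0]

log_values = [math.log10(v) for v in values]    # log transformation (base 10)
sqrt_values = [math.sqrt(v) for v in values]    # square root transformation
inverse_values = [1.0 / v for v in values]      # inverse transformation
scaled = min_max_normalize(values)              # min-max normalization

print(log_values, sqrt_values, inverse_values, scaled)
```

Log, square root and inverse transformations require positive inputs (or non-negative for the square root), so data containing zeros or negative values would need shifting first.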
Module 3
Predictive Analytics: Supervised Learning Algorithms
Part-10: Predictive modeling & Diagnostics
Correlation
Simple Linear Regression (SLR)
Multiple Linear Regression (MLR)
Examination of Residuals
Autocorrelation
ANOVA Test of Significance
VIF Analysis
t-Test of Significance
CP Indexing
Eigenvalue for PCA Analysis
Homoscedasticity
Heteroskedasticity
Stepwise regression
Forward Regression
Backward Regression
Multicollinearity
Cross validation
MAPE
Check prediction accuracy
Standardized Regression
Quadratic Regression
Transformed Regression
Dummy Variables Regression
Part-11: Logistic Regression Analysis
Logistic Regression
Discriminant Regression Analysis
Multiple Discriminant Analysis
Stepwise Discriminant Analysis
Logit function
Test of Associations
Chi-square strength of association
Binary Regression Analysis
Probit and Logit Models
Estimation of probability using logistic regression
Wald Test statistics for Model
Hosmer-Lemeshow test
Nagelkerke R-square
Pseudo R-square
Maximum likelihood estimation
Model Fit
Model cross validation
Discrimination functions
AIC
BIC (Bayesian information criterion)
Kappa Statistics
Error/ Confusion matrices
ROC
APE
MAPE
Lift Curve
Sensitivity
Misclassification Rate
Specificity
Maximum Absolute Error
Recall
Misclassification
Root Final Prediction Error
Gini Coefficient
Schwarz’s Bayesian Criterion
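Several of the evaluation measures listed in this part (confusion matrix, sensitivity or recall, specificity, misclassification rate) come from the same four counts. The sketch below assumes binary labels coded as 1 (positive) and 0 (negative); the label vectors and the function name `confusion_counts` are made up for illustration.

```python
def confusion_counts(actual, predicted):
    """Count true positives, false positives, true negatives, false negatives."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

# Hypothetical labels used only for illustration (1 = positive, 0 = negative).
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, fp, tn, fn = confusion_counts(actual, predicted)
sensitivity = tp / (tp + fn)                  # also called recall
specificity = tn / (tn + fp)
misclassification_rate = (fp + fn) / len(actual)

print(tp, fp, tn, fn, sensitivity, specificity, misclassification_rate)
```

An ROC curve is obtained by repeating this calculation at different classification thresholds and plotting sensitivity against 1 minus specificity.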
Module 4
(Advanced Analytics 1) Unsupervised Learning Algorithms
Part-12: Dimension Reduction Analysis
Introduction to Factor Analysis
Principal component analysis
Reliability Test
KMO and MSA tests, Eigenvalue Interpretation
Rotation and Extraction steps
Varimax Models
Confirmatory Factor Analysis
Exploratory Factor Analysis
Factor Score for Regression
Part-13: Cluster Analysis
Introduction to Cluster Techniques
Hierarchical clustering
K Means clustering
Ward's Method
Agglomerative Clustering
Variance Methods
Maximum distance Linkage Methods
Centroid distance Methods
Minimum distance Linkage Method
Cluster Dendrogram
Euclidean distance methods
Module 5
Forecasting and Operations Analytics
Naive Forecasting
Moving Average
Exponential Smoothing
ARIMA
Auto-Regressive Integrated Moving Average (ARIMA) models
ARIMAX
Conjoint analysis
Discriminant analysis
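The simpler forecasting methods named in this module, the moving average and exponential smoothing, can be sketched as follows. The demand series, the window size, the smoothing constant `alpha` and the function names are all hypothetical choices for illustration; the smoothing shown is simple (single) exponential smoothing initialised with the first observation, which is one common convention.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive observations."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

def exponential_smoothing(series, alpha):
    """Simple exponential smoothing, initialised with the first observation."""
    smoothed = [series[0]]
    for value in series[1:]:
        # New level = alpha * latest observation + (1 - alpha) * previous level.
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical demand series used only for illustration.
demand = [10, 20, 30, 40]
ma = moving_average(demand, window=2)
es = exponential_smoothing(demand, alpha=0.5)

print(ma)
print(es)
```

A larger `alpha` makes the smoothed series react faster to recent values, while a smaller one gives more weight to history.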
Module 5
(Advanced Analytics 3) Machine Learning Algorithms
Prediction
Support Vector Machines (SVM)
Binary Regression/Logit Model
Probit Model
Naïve Bayes
Naïve Bayes Multinomial
Ordinal Regression
Multinomial Regression
k-Nearest Neighbor Classification
Decision Stump
CHAID Analysis
Recommender Systems
Collaborative Filtering
Advanced Recommender Systems
Bootstrap Aggregating (Bagging)
Random Forest
Adaptive Boosting
Gradient Boosting
Support vector machine
Neural Network
C4.5 / C5.0
J48 Pruning, Unpruning
Decision trees
Module 6
(Advanced Analytics 4) Artificial Intelligence (3 Days)
Introduction to neural networks; rule-based expert systems
Introduction to artificial neural networks (ANN); Neuron as computing element; Perceptron: McCulloch-Pitts model; Backpropagation algorithm; Multi-layer Neural Networks
Deep learning algorithms: Convolutional networks; Recurrent nets; Auto-encoders
Deep Learning Platform: H2O.ai; Dato GraphLab; TensorFlow
Module 7
(Advanced Analytics 5)
Survival Analysis
Mantel-Haenszel Test
Kaplan-Meier (Product-Limit) Estimator
Cox’s Proportional Hazards Model
Cox-Snell Residual
Hazard Functions
Proportional Hazards Assumption
Module 8
(Advanced Analytics 2)
Big Data Analytics
Introduction to Big Data
Sources of Big Data
Big Data technologies: Hadoop Distributed File System (HDFS); Employing Hadoop
Statistical Analysis of Big Data
Pig
Hive
MapReduce
NoSQL