Reachout Analytics

Data Science: New Batch Starts on 15 April 2017

 Data Science
(With SAS, R, WEKA, SPSS, and Excel)*

Module 1: Introduction to Data Science

Part-1: Reference Details for Data Science and Business Analytics

 Scope & Facts of Data Science and Business Analytics
 SWOT Analysis of Data Science and Business Analytics
 Introduction to Advanced Data Analytics
 The Journey: Mathematics, Statistics, Econometrics
 Flow chart for Data Science and Business Analytics
 Data warehouse conceptual discussions
 Hadoop for Data Science
 OLTP and OLAP for data and information
 Web Application report

Module 2: Data Visualization and Summarization

Part-2: Descriptive Statistics:

 Descriptive Statistics
 Inferential Statistics
 Types of Variables
 Measures of central tendency
 Data Variability and Dispersion
 Five number Summary Analysis
 Data Distribution Techniques
 Exploration Techniques for Numerical data
 Exploration techniques for Character Data
 Visualization Exploration
 Summary Exploration
 Chebyshev’s Inequality

Part-3: Basic Probability for Business Issues:

 Simple Probability
 Marginal Probability
 Joint Probability
 Conditional Probability (linked with Decision Tree algorithms)
 Bayes’ Theorem (linked with Naïve Bayes algorithms)
 Discrete Distributions
 Binomial Distribution
 Hypergeometric Distribution
 Poisson Distribution
 Continuous Distributions
 Normal Distribution and Properties
 Standardized Distributions

Part-4: Sampling Techniques for Big Data

 Sampling Distributions
 Simple Random Sample
 Systematic Sample
 Stratified Sample
 Cluster Sample
 Standard Error of the Mean
 Skewness Std. Error
 Kurtosis Std. Error
 Central Limit Theorem
 Sampling from Infinite Populations
 Sampling Distributions for Mean
 Sampling Distributions for proportions

Module 3: Data Preparation and Quality Check

Part-5: Data Validation and Data Normality

 Univariate normality techniques
 Bivariate techniques
 Multivariate techniques
 Q-Q probability plots
 Cumulative frequency
 Exploratory analysis
 Stem-and-leaf analysis
 Histogram
 Box plot
 Scores for Normality Check
 Kolmogorov-Smirnov test
 Shapiro-Wilk test
 Anderson-Darling test

Part-6: Data Cleaning Process and Quality Check

 PCA for Big Data Analysis or Unsupervised data
 PCA Regression Scores for Supervised data
 Noisy data detection
 Data cleaning with Regression Residual
 Data Scrubbing with statistical sense

Part-7: Data Imputation and Outlier Treatment

 Outlier treatment with robust measurements
 Outlier treatment with central tendency Mean
 Outlier with Min Max Likelihood methods
 Outlier Detection with Density Based
 Visualize Outlier Treatment
 Summarized Outlier Treatment
 Multivariate Outlier Detection with Mahalanobis Distance
 Multivariate Chi-square statistics
 Outlier with Residual Analysis
 Outlier Detection with PCA Analysis
 Data Imputation with series Central Tendency

Part-8: Test of Hypothesis

 Null Hypothesis formulation
 Alternative Hypothesis
 Type I and Type II errors
 Power Value
 One tail and Two tail
 One Sample T-TEST
 Paired T-TEST
 Independent Sample T-TEST
 Analysis of Variance (ANOVA)
 MANOVA
 Chi Square Test
 Kendall Chi Square
 Kruskal-Wallis Rank Test Chi Square
 Mann-Whitney U Test
 Wilcoxon Test
 McNemar test Chi Square

Part-9: Data Transformation

 Log transformation
 Arcsine transformation
 Box-Cox transformation
 Square root transformation
 Inverse transformation
 Min Max Data normalization

Module 4: Predictive & Estimation Models (Supervised Learning)

Part-10: Predictive modeling & Diagnostics

 Correlation – Pearson, Kendall, Wilcox
 SLR Regression
 MLR Regression
 Examination of Residuals (residual analysis)
 Auto Correlation
 ANOVA Test of Significance
 VIF Analysis
 t-Test of Significance
 CP Indexing
 Eigen Value for PCA Analysis
 Homoscedasticity
 Heteroskedasticity
 Stepwise regression
 Forward Regression
 Backward Regression
 Multicollinearity
 Cross validation
 MAPE
 Check prediction accuracy
 Standardized regression
 Quadratic Regression
 Transformed Regression
 Dummy Variables Regression

Part-11: Logistic Regression Analysis

 Logistic Regression
 Discriminant Regression Analysis
 Multiple Discriminant Analysis
 Stepwise Discriminant Analysis
 Logit function
 Test of Associations
 Chi-square strength of association
 Binary Regression Analysis
 Probit and Logit Models
 Estimation of probability using logistic regression
 Wald Test statistics for Model
 Hosmer-Lemeshow test
 Nagelkerke R square
 Pseudo R square
 Maximum likelihood estimation
 Model Fit
 Model cross validation
 Discriminant functions
 AIC
 BIC (Bayesian information criterion)

Module 5: Advanced Big Data Analytics

Part-12: Dimension Reduction Analysis

 Introduction to Factor Analysis
 Principal component analysis
 Reliability Test
 KMO and MSA tests, Eigenvalue Interpretation
 Rotation and Extraction steps
 Varimax Models
 Confirmatory Factor Analysis
 Exploratory Factor Analysis
 Factor Score for Regression

Part-13: Cluster Analysis

 Introduction to Cluster Techniques
 Hierarchical clustering
 K Means clustering
 Ward's Method
 Agglomerative Clustering
 Variation Methods
 Maximum distance Linkage Methods
 Centroid distance Methods
 Minimum distance Linkage Method
 Cluster Dendrogram
 Euclidean distance methods

Module 6: Data Mining (Machine Learning)

Part-14: Data Mining, Machine Learning / Artificial Intelligence Functional Models

 Prediction
 Support Vector Machines (SVM)
 Gaussian Models
 Neural Network
Classification Models
 Binary Regression/Logit Model
 Probit Model
 Naïve Bayes
 Naïve Bayes Multinomial
 Ordinal Regression
 Multinomial Regression
 Discriminant analysis
Clustering Models
 DBSCAN
 EM (Expectation Maximization)
 K-Means Clustering
 Simple Cluster
 Hierarchical Cluster
 k-Nearest Neighbor Classification
Tree Models
 Random Forests: Bagging & Boosting
 Decision Stump
 CHAID Analysis
 C4.5 / C5.0
 J48: Pruned and Unpruned
 Decision trees
Survival Analysis
 Mantel-Haenszel Test
 Kaplan-Meier (Product-Limit) Estimator
 Cox’s Proportional Hazards Model
 Cox-Snell Residual
 Hazard Functions
 Proportional Hazards Assumption

Part-15: Time Series

Autoregressive Models
Moving Average Model
Multiplicative model
ARMA Model
Additive Model

Part-16: Model Validation and Testing

 Kappa Statistics
 AIC
 BIC
 Error / Confusion matrices
 ROC
 APE
 MAPE
 Lift Curve
 Sensitivity
 Misclassification Rate
 Specificity
 Maximum Absolute Error
 Root Final Prediction Error
 Gini Coefficient
 Schwarz’s Bayesian Criterion

Part-17: Hadoop Ecosystem (Big Data Handling)

Pig
Hive
MapReduce
Mahout
NoSQL

Note: * Open-source tools are freely available; for the commercial tools (SAS, SPSS) we use trial versions.

77-A Journalist Colony, Andhra jyothy Lane, Jubilee Hills, Hyderabad – 500033, India
www.robaservices.com, www.reachoutanalytics.com

Landline: +91 40 32910202

Mobile: +91 9700213845
