Splunk 8.0 for Analytics and Data Science
Upcoming Classes
Online
Instructor-led online training
Location | Feb 2023 | Mar 2023 | Apr 2023 | May 2023 | Jun 2023 | Jul 2023 | Aug 2023 |
---|---|---|---|---|---|---|---|
EMEA UK Time - Virtual | Feb 1 – Feb 3 | Mar 1 – Mar 3, Mar 29 – Mar 31 | Apr 26 – Apr 28 | | | | |
AMER Eastern Time - Virtual | Feb 15 – Feb 17 | Mar 22 – Mar 24 | Apr 19 – Apr 21 | | | | |
APAC Singapore - Virtual | | Mar 1 – Mar 3 | | | | | |
AMER Pacific Time - Virtual | | Mar 15 – Mar 17 | Apr 26 – Apr 28 | | | | |

Summary
This 13.5-hour course is for users who want to attain operational intelligence level 4 (business insights). It covers implementing analytics and data science projects using Splunk's statistics, machine learning, and built-in and custom visualization capabilities.
Description
- Analytics Framework
- Exploratory Data Analysis
- Regression for Prediction
- Cleaning and Preprocessing Data
- Algorithms, Preprocessing and Feature Extraction
- Clustering Data
- Detecting Anomalies
- Forecasting
- Classification
Objectives
Module 1 – Analytics Workflow
- Define terms related to analytics and data science
- Define the analytics workflow
- Describe common usage scenarios
- Navigate Splunk Machine Learning Toolkit
Module 2 – Exploratory Data Analysis
- Describe the purpose of data exploration
- Identify SPL commands for data exploration
- Split data for testing and training using the sample command
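As a rough illustration of the train/test split named in this module, the sketch below shows the general idea in plain Python rather than SPL. The function name `split_events`, the 70/30 ratio, and the sample events are assumptions made for this example; they are not taken from the course material and do not describe how the sample command is implemented.

```python
import random


def split_events(events, train_fraction=0.7, seed=42):
    """Shuffle events reproducibly, then split them into training and test sets."""
    shuffled = list(events)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # a fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical events, used only to demonstrate the split.
events = [{"id": i, "value": i * 2} for i in range(10)]
train, test = split_events(events)
print(len(train), "training events,", len(test), "test events")
```

Holding back a test set that the model never sees during training is what makes a later evaluation meaningful.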
Module 3 – Predict Numeric Fields with Regression
- Differentiate predictions from estimates
- Identify prediction algorithms and assumptions
- Describe the fit and apply commands
- Model numeric predictions in the MLTK and Splunk Enterprise
- Use the score command to evaluate models
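The fit, apply, and score workflow in this module (train a model, run it on new data, then evaluate it) can be sketched outside Splunk as follows. This is a minimal plain-Python analogue, not MLTK code: the function names `fit_line`, `apply_line`, and `r_squared` and the small data set are hypothetical, and ordinary least squares with a single predictor stands in for whichever regression algorithm the course uses.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_line(model, xs):
    """Apply a fitted (slope, intercept) model to new inputs."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual variation / total variation."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical training data (roughly y = 2x + 1) and held-out test data.
model = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
test_x, test_y = [5, 6], [11.2, 12.9]
predicted = apply_line(model, test_x)
print("R^2 on held-out data:", r_squared(test_y, predicted))
```

Scoring on held-out data rather than on the training data gives a more honest picture of how the model will behave on new events.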
Module 4 – Clean and Preprocess the Data
- Define preprocessing and describe its purpose
- Describe algorithms that preprocess data for use in models
- Use FieldSelector to choose relevant fields
- Use PCA and ICA to reduce dimensionality
- Normalize data with StandardScaler and RobustScaler
- Preprocess text using Imputer, NPR, TF-IDF, HashingVectorizer, and the cluster command
Module 5 – Clustering Data
- Define clustering
- Identify clustering methods, algorithms, and use cases
- Use Smart Clustering Assistant to cluster data
- Evaluate clusters using silhouette score
- Validate cluster coherence
- Describe clustering best practices
Module 6 – Detecting Anomalies
- Define anomaly detection and outliers
- Identify anomaly detection use cases
- Use Splunk Machine Learning Toolkit Smart Outlier Assistant
- Detect anomalies using the Density Function algorithm
- Optimize anomaly detection with the Local Outlier Factor
- View results with the Distribution Plot visualization
Module 7 – Forecasting
- Differentiate predictions from forecasts
- Use the Smart Forecasting Assistant
- Use the StateSpaceForecast algorithm
- Forecast multivariate data
- Account for periodicity in each time series
Module 8 – Classification
- Define key classification terms
- Use classification algorithms
  - AutoPrediction
  - LogisticRegression
  - SVM (Support Vector Machines)
  - RandomForestClassifier
- Evaluate classifier tradeoffs
- Evaluate results of multiple algorithms
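To give a sense of what evaluating classifier tradeoffs can involve, the sketch below compares two sets of predictions by precision and recall. It is plain Python under stated assumptions: the function `precision_recall`, the labels, and the "cautious" and "eager" predictions are all invented for illustration and do not come from the course or from any of the algorithms listed above.

```python
def precision_recall(actual, predicted, positive=1):
    """Return (precision, recall) for the given positive class label."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical ground truth and the output of two hypothetical classifiers.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
cautious = [1, 0, 0, 0, 0, 0, 0, 1]  # flags few events, misses some positives
eager = [1, 1, 1, 1, 1, 0, 0, 1]     # flags many events, raises false alarms

for name, predicted in (("cautious", cautious), ("eager", eager)):
    precision, recall = precision_recall(actual, predicted)
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

Neither result is better in the abstract: a classifier that rarely raises false alarms may miss real positives, and one that catches every positive may flag too much, so the right choice depends on the cost of each kind of error.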