Data Mining Functionalities 🛠️💡

What can data mining actually do? In the technical world, these are called "functionalities." Each functionality answers a different type of business question.



1. Characterization and Discrimination

These functionalities are used to describe the data in a summarized format.

  • Data Characterization: Summarizing the common traits of a target class (e.g., "Our top 1% customers are typically aged 35-45, live in urban areas, and spend ₹50,000+ monthly").
  • Data Discrimination: Comparing the target class with one or more contrasting classes (e.g., "How do 'Loyal Customers' differ from 'One-time Buyers' in terms of their interest in discounts?").
  • The Output: The results are typically presented as charts (pie, bar), curves, multidimensional data cubes, or tables.
  • Business Use: Understanding the profile of your most profitable products or regions.
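
As a rough illustration of the two ideas above, the following minimal sketch summarizes one class (characterization) and then contrasts it with another (discrimination). The customer records, field names, and the helper functions `characterize` and `discriminate` are all made up for this example.

```python
from statistics import mean

# Hypothetical customer records (illustrative values only).
customers = [
    {"segment": "loyal", "age": 38, "monthly_spend": 52000},
    {"segment": "loyal", "age": 42, "monthly_spend": 61000},
    {"segment": "one_time", "age": 25, "monthly_spend": 4000},
    {"segment": "one_time", "age": 31, "monthly_spend": 6000},
]

def characterize(records, segment):
    """Characterization: summarize the common traits of one class."""
    rows = [r for r in records if r["segment"] == segment]
    return {
        "count": len(rows),
        "avg_age": mean(r["age"] for r in rows),
        "avg_spend": mean(r["monthly_spend"] for r in rows),
    }

def discriminate(records, target, contrast):
    """Discrimination: difference between target and contrasting class averages."""
    t = characterize(records, target)
    c = characterize(records, contrast)
    return {key: t[key] - c[key] for key in ("avg_age", "avg_spend")}

loyal_profile = characterize(customers, "loyal")
gap = discriminate(customers, "loyal", "one_time")
print(loyal_profile)
print(gap)
```

In practice the same summaries would usually come from a GROUP BY query or a data cube rather than hand-written loops.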

2. Mining Frequent Patterns & Associations

Discovering relationships between items that occur frequently together in a single transaction or session.

  • Association Rule Discovery: Identifying "If-Then" patterns (e.g., "If a customer buys a Printer, there is an 80% chance they will buy Ink Cartridges").
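  • Support and Confidence: These metrics measure how reliable a rule is. Support is the share of all transactions that contain every item in the rule, so high support means the pattern is common. Confidence is the share of transactions containing the "If" item that also contain the "Then" item, so high confidence means the rule is a strong predictor.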
  • Market Basket Analysis: Used by retailers to decide product placement (e.g., placing snacks near the cold beverages).
  • Correlation Analysis: Finding if two variables move together (e.g., as temperature rises, sales of ice cream increase).
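
Support and confidence can be computed directly from transaction counts. The sketch below uses a small, invented list of baskets and two hypothetical helpers, `support` and `confidence`, to score the rule "Printer → Ink".

```python
# Hypothetical transactions: each set is one customer's basket.
transactions = [
    {"printer", "ink"},
    {"printer", "ink", "paper"},
    {"printer", "paper"},
    {"ink"},
    {"printer", "ink"},
]

def count_containing(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket)

def support(itemset, baskets):
    """Share of all baskets that contain the whole itemset."""
    return count_containing(itemset, baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Share of baskets with the antecedent that also contain the consequent."""
    both = count_containing(set(antecedent) | set(consequent), baskets)
    return both / count_containing(antecedent, baskets)

rule_support = support({"printer", "ink"}, transactions)
rule_confidence = confidence({"printer"}, {"ink"}, transactions)
print(f"support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Here 3 of the 5 baskets contain both items, and 3 of the 4 baskets with a printer also contain ink. Real market basket analysis would use an algorithm such as Apriori to search for candidate rules instead of scoring one rule by hand.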

3. Classification

Predicting a category (label) for new, unlabeled data based on patterns learned from historical labeled data.

  • Supervised Learning: The algorithm is "trained" using data where the answer is already known (Training Set).
  • Binary Classification: Answers "Yes/No" questions (e.g., "Is this email Spam or Not Spam?").
  • Multi-class Classification: Assigning data to one of several categories (e.g., "Is this news article about Sports, Tech, or Finance?").
  • Model Accuracy: Measured on a separate "Test Set" that the model did not see during training, as the fraction of test examples it labels correctly.
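
The train/test workflow above can be sketched with a very simple classifier. This example assumes a 1-nearest-neighbor rule (one of many possible classification methods) and invented spam data in which each email is described by two made-up features: number of links and number of exclamation marks.

```python
import math

# Hypothetical labeled data: ((links, exclamation_marks), label).
training_set = [
    ((8, 5), "spam"),
    ((7, 6), "spam"),
    ((9, 4), "spam"),
    ((1, 0), "not_spam"),
    ((0, 1), "not_spam"),
    ((2, 1), "not_spam"),
]
test_set = [
    ((6, 5), "spam"),
    ((1, 1), "not_spam"),
    ((8, 7), "spam"),
    ((3, 0), "not_spam"),
    ((5, 4), "not_spam"),  # a borderline email the model gets wrong
]

def predict(features, examples):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(examples, key=lambda ex: math.dist(features, ex[0]))
    return nearest[1]

def accuracy(test_examples, examples):
    """Fraction of test examples whose predicted label matches the true label."""
    correct = sum(predict(x, examples) == y for x, y in test_examples)
    return correct / len(test_examples)

model_accuracy = accuracy(test_set, training_set)
print(f"accuracy = {model_accuracy:.2f}")
```

The test examples are kept out of the training set, so the score reflects how the model handles data it has not seen.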

4. Cluster Analysis

Discovering natural groups in the data when there are NO pre-defined labels.

  • Unsupervised Learning: The computer groups items based purely on their similarity or distance from each other.
  • Intra-class Similarity: Items within the same group should be as similar as possible.
  • Inter-class Dissimilarity: Different groups should be as different from each other as possible.
  • Customer Segmentation: Identifying groups like "Occasional Luxury Buyers" vs. "Regular Budget Shoppers" without manually tagging them.
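
One common way to find such groups is k-means, shown here as a minimal sketch rather than a production implementation. The customer points (visits per month, average basket value in thousands of ₹) and the starting centroids are invented, and the functions `assign` and `kmeans` are hypothetical helpers written for this example.

```python
import math

# Hypothetical customers: (visits_per_month, avg_basket_in_thousands).
points = [(1, 9.0), (2, 8.5), (1, 9.5), (12, 0.7), (15, 0.6), (14, 0.8)]

def assign(data, centroids):
    """Put each point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in data:
        nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(data, centroids, iterations=10):
    """Alternate between assigning points and moving centroids to cluster means."""
    for _ in range(iterations):
        clusters = assign(data, centroids)
        centroids = [
            # An empty cluster keeps its previous centroid.
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return centroids, assign(data, centroids)

# Start from two of the points; no labels are supplied anywhere.
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(clusters)
```

The two resulting groups correspond roughly to "Occasional Luxury Buyers" (few visits, large baskets) and "Regular Budget Shoppers" (many visits, small baskets), even though the code never sees those names.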

5. Prediction (Regression)

Predicting continuous numeric values (such as prices or revenue) rather than categories.

  • Trend Analysis: Predicting the future value of a stock or a house based on historical growth rates.
  • Numerical Estimation: Estimating next quarter's revenue from current marketing spend. The result is an estimate with some error, not an exact figure.
  • Evolutionary Analysis: Studying how patterns change over long periods (e.g., how the popularity of a fashion style rises and falls).
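
Numerical estimation is often done with simple linear regression. The sketch below fits a straight line by ordinary least squares to invented figures for marketing spend and revenue (both in arbitrary units), then estimates revenue for a new spend level. The function name `fit_line` is a hypothetical helper.

```python
# Hypothetical history: marketing spend and the revenue that followed.
spend = [10, 20, 30, 40]
revenue = [24, 46, 64, 86]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(spend, revenue)

# Estimate revenue for a planned spend of 50 (an extrapolation, so treat with care).
estimated_revenue = slope * 50 + intercept
print(f"slope={slope:.2f}, intercept={intercept:.2f}, estimate={estimated_revenue:.1f}")
```

The output is a continuous number rather than a category, which is what separates this task from classification.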


Summary

  • Characterization and Discrimination: Summarizes the traits of a class and contrasts it with other classes.
  • Association: Finds co-occurring items.
  • Classification: Predicts categories (Supervised).
  • Clustering: Finds natural groups (Unsupervised).
  • Prediction: Projects future values.
