Customer relationship management (CRM) data mining is the process of searching customer relationship databases and analyzing the data gathered on customer behavior. The results help marketers focus their campaigns more precisely, which can increase customer retention and sales. CRM data mining is also known as data exploration and knowledge discovery. It falls into two main categories: descriptive analysis and predictive modeling.
Descriptive analysis uses segmentation and clustering to examine patterns of behavior within a particular group of customers. Customers can be grouped by gender, age, race, and other attributes. The main goal of a segment is to give the marketer a group of similar customers whose data can be mined more effectively for useful insights.
Clustering aggregates segment groups. Clusters are mutually exclusive, and each is defined by a set of predetermined characteristics. For instance, a cluster could include women aged 18 to 25 who purchased a certain nail polish during the last two weeks of December 2010. This is an example of a qualitative method of CRM data mining.
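The nail polish cluster described above can be sketched as a filter over purchase records. This is a minimal illustration, not a prescribed method: the record layout, the field names, and the reading of "the last two weeks of December" as December 18 to 31 are all assumptions.

```python
from datetime import date

# Hypothetical purchase records; the field names are assumptions for illustration.
purchases = [
    {"customer_id": 1, "gender": "F", "age": 22, "product": "nail polish", "date": date(2010, 12, 20)},
    {"customer_id": 2, "gender": "F", "age": 31, "product": "nail polish", "date": date(2010, 12, 22)},
    {"customer_id": 3, "gender": "M", "age": 24, "product": "nail polish", "date": date(2010, 12, 28)},
    {"customer_id": 4, "gender": "F", "age": 19, "product": "nail polish", "date": date(2010, 12, 5)},
    {"customer_id": 5, "gender": "F", "age": 25, "product": "nail polish", "date": date(2010, 12, 31)},
]

# Assumed window for "the last two weeks of December 2010".
WINDOW_START = date(2010, 12, 18)
WINDOW_END = date(2010, 12, 31)


def in_cluster(purchase):
    """Return True if the purchase matches every predetermined characteristic."""
    return (
        purchase["gender"] == "F"
        and 18 <= purchase["age"] <= 25
        and purchase["product"] == "nail polish"
        and WINDOW_START <= purchase["date"] <= WINDOW_END
    )


# Collect the distinct customers who belong to the cluster.
cluster_ids = sorted({p["customer_id"] for p in purchases if in_cluster(p)})
print(cluster_ids)
```

Because the rule is fixed in advance, each customer is either in the cluster or not, which matches the predetermined characteristics described above.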
In non-exclusionary segments, another form of descriptive analysis, a particular set of customer behaviors leads to a completely new set of behaviors. For instance, a group of customers could spend a significant amount of money on spa services but little on related services such as hair and salon care. This type of CRM data mining requires more advanced statistical analysis than basic segmentation.
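As a rough sketch of the spa example, the snippet below tags each customer with every segment whose rule they satisfy, so one customer can sit in several segments at once. The customer names, spend figures, and thresholds are hypothetical, and simple threshold rules stand in for the more advanced statistical analysis mentioned above.

```python
# Hypothetical annual spend per customer, by service category.
spend = {
    "alice": {"spa": 900.0, "salon": 40.0},
    "bora": {"spa": 850.0, "salon": 600.0},
    "chen": {"spa": 50.0, "salon": 30.0},
    "dana": {"spa": 1200.0, "salon": 0.0},
}

# Assumed thresholds; real cut-offs would come from the data.
HIGH_SPA = 500.0
LOW_SALON = 100.0

# Each segment is a rule; the rules are not mutually exclusive.
segment_rules = {
    "high_spa": lambda s: s["spa"] >= HIGH_SPA,
    "low_salon": lambda s: s["salon"] <= LOW_SALON,
}

# A customer is listed under every segment whose rule they satisfy.
membership = {
    name: sorted(seg for seg, rule in segment_rules.items() if rule(s))
    for name, s in spend.items()
}

# Customers who spend heavily on spa services but little on salon care.
spa_not_salon = sorted(
    name for name, segs in membership.items()
    if {"high_spa", "low_salon"} <= set(segs)
)
print(spa_not_salon)
```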
Predictive modeling is the more popular of the two CRM data mining categories. It measures the degree of correlation between two customer behavior factors and the statistical reliability of that correlation. The predictive model is built with a data mining application that assigns a score to each customer, indicating the likelihood that the customer will behave in the same way in the future. For example, the model can help a marketer estimate the probability that a married male customer between the ages of 31 and 42 with children will purchase a particular brand of lawn mower within the next six months.
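To make "degree of correlation" and "statistical reliability" concrete, here is a minimal sketch that computes the Pearson correlation between two behavior factors and the t statistic commonly used to test whether that correlation differs from zero. The two factors (store visits and a purchase flag) and their values are invented for illustration; a real data mining application would go on to build a scoring model from such relationships.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero.

    Undefined when r is exactly 1 or -1.
    """
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical data: store visits in the last quarter, and whether the
# customer bought the product (1) or not (0).
store_visits = [1, 2, 2, 3, 4, 5, 6, 8]
purchased = [0, 0, 0, 1, 0, 1, 1, 1]

r = pearson_r(store_visits, purchased)
t = t_statistic(r, len(store_visits))
print(f"r = {r:.3f}, t = {t:.3f}")
```

A larger absolute t value for a given sample size indicates that the observed correlation is less likely to be due to chance.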
Specificity is very important in CRM data mining using predictive models. Several types of models are used for this purpose. A univariate model compares a single variable with each of several other variables in order to determine which relationship has the highest correlation. Chi-squared automatic interaction detection (CHAID) and classification and regression tree (CART) models produce decision trees, in which the data are split on one variable at a time and each split leads to further splits on other variables. A multivariate regression model tests several variables against each other to evaluate possible correlations.
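The core idea behind a CHAID-style split can be sketched by computing the chi-squared statistic between each candidate variable and the outcome, then splitting on the variable with the strongest association. This is a simplified illustration only: full CHAID also merges categories and compares adjusted p-values, and the customer attributes and records below are hypothetical.

```python
from collections import Counter


def chi_squared(rows, predictor, target):
    """Pearson chi-squared statistic for the predictor-by-target contingency table."""
    observed = Counter((row[predictor], row[target]) for row in rows)
    predictor_totals = Counter(row[predictor] for row in rows)
    target_totals = Counter(row[target] for row in rows)
    n = len(rows)
    stat = 0.0
    for p_value, p_total in predictor_totals.items():
        for t_value, t_total in target_totals.items():
            # Expected count if predictor and target were independent.
            expected = p_total * t_total / n
            stat += (observed[(p_value, t_value)] - expected) ** 2 / expected
    return stat


# Hypothetical customer records.
customers = [
    {"homeowner": True, "has_children": True, "bought_mower": True},
    {"homeowner": True, "has_children": False, "bought_mower": True},
    {"homeowner": True, "has_children": True, "bought_mower": True},
    {"homeowner": True, "has_children": False, "bought_mower": False},
    {"homeowner": False, "has_children": True, "bought_mower": False},
    {"homeowner": False, "has_children": False, "bought_mower": False},
    {"homeowner": False, "has_children": True, "bought_mower": False},
    {"homeowner": False, "has_children": False, "bought_mower": True},
]

# Score each candidate variable and pick the one to split on first.
# Both candidates are binary, so their statistics are directly comparable.
scores = {
    name: chi_squared(customers, name, "bought_mower")
    for name in ("homeowner", "has_children")
}
best_split = max(scores, key=scores.get)
print(scores, best_split)
```

Repeating the same selection within each resulting subgroup yields the branching tree structure described above.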