Essential Guide to Understanding Customers & Prospects – Session 3: Analysis Tools & Manipulation
Having seen some profiling examples & techniques in previous articles, you might now be thinking about what tools you need to help with analysing your data. There is a wide range of analysis tools some of which can be expensive and have high training requirements so you should invest some time making sure you are choosing the right tool for you.
We are generally tool agnostic. That is we buy and license our tools like everyone else, although OK we confess to having developed a few analysis tools along the way too! These are often to support specific application such as calculating sample sizes (Talking Numbers DM Calculator, or our TN Profiler which supports fast cross-tabulation of data, filtering and selections), or to support techniques such as Conjoint analysis. The point is that actually you rarely find all the functionality or capability you need in anyone tool. Life is never that simple!
Our analysis department has plenty of tools in the toolbox, and we all have our favourites for different applications. Along the way we’ve collectively used, played with and thrown out of the window many and various applications! So we’ve pulled together our respective experiences with some of the most commonly available tools, and put down our thoughts on what you might reasonably need to lever your customer data.
Firstly when thinking about your analysis tool requirements start with some of the following questions:
How much data do I have? Small datasets of a few thousand record can be comfortably wielded in excel or Access whilst larger volumes present different challenges, depending on what you’re doing.
So what do I need to be able to do with my data?
- Mailing selections (marketing campaigns, announcements)
- Counts (number of customers/suppliers/intermediaries/products/sales)
- Reporting (sales figures, regional reporting, new business)
- Data manipulation (manipulating string variables, mathematical functions, file aggregation, ransformations)
- Profiling (types of business, by region, postcode area, SIC, size, product holding)
- Statistical analysis (regression, segmentation, significance testing)
- Work with multiple datasets
- Everything else you can think of!
Do I also need a data repository?
Quite often an analysis datamart is really important as it allows an analyst to take snapshots of data, to trend and understand change over time, which is not always possible in a marketing database, as they have a habit of over-writing data to get to latest situation (how inconsiderate!)
How much do I want to spend on the new tool? (and how much are you allowed to spend? – often 2 different answers, but never mind)
How much time can I devote to training/what skills do I already have? (hugely important to be realistic. Whilst you might fancy individually adjusting neural nodes and building your Kohonen maps, the reality is you can just about squeeze 5 minutes to get that campaign report done in Excel.)
What different user groups do I have?
- Power users (programmers, statisticians, analysts)
- Medium users (reporting, counts, mailing selections)
- Light users (high level reporting, counts)
- Non users (business decision makers, recipients of reports)
What tools do I already have? Remember Excel can be a powerful analysis tool if your data manipulation requirements are low and you have small amounts of data to look at (less than 65k rows).
Some of the key things to look for when choosing your tool are:
Ability to meet your analysis requirements – now and in the future
How might your analysis requirements change over time? As your requirements change, can you purchase add ons as and when required or do you need to buy everything up front even if you won’t be ready to use it yet?
Functionality Tie this in with your requirements – can you manipulate data easily – what about those all important string functions? Can the tool handle multiple datasets? And what is the maximum capacity in terms of working with large datasets?
Do consider the areas of data import & export, and what level of user it is designed for – (I once worked with a tool which wasn’t designed for power users(!) so for one piece of analysis I had to perform 82 separate exports before I could build the analysis dataset I needed – it took longer to make the exports than to perform the actual analysis!).
Think about whether you will need statistical functions (e.g. T Tests, ANOVA, Chi-Square, Z scores, regression, cluster analysis etc)?
Compatibility with your current technology There is nothing more frustrating than purchasing an expensive tool on an annual license and then finding that it crashes your virus protection software or that you then have to also purchase a specialist server system to house it!!
There are many examples of well-known tools, including some of those mentioned in this article, that have failed because they clashed with other software on the network, the network wasn’t sized to support the data traffic, the operating system was old and not supported etc. etc. etc. Do speak to your IT dept!!
IT resource required (maintenance)Will you need to set aside IT resource for managing the installation/integration and ongoing maintenance of the tool?
Ease of use/Training requirements & cost of training
How easy is the tool to use? Do you have the expertise currently? Will you have to recruit a new analyst or can you train someone as part of their career development? How much training will be required? Is it costly? What support does the provider give (e.g. help line, online help, user manuals)?
Cost vs benefitsEven though your chosen tool may be expensive, the long term benefits you can gain from using your data intelligently can by far outweigh the cost! (I’m beginning to sound like a supplier here!! Rather than an analyst with lots of great tools. Ooer!)
Data manipulation
When we talk about data manipulation, we mean using the existing data and turning it into useable formats or creating new variables that you need for your analysis (e.g. turning number of employees into a banded variable “less than 20″, “21 to 50″, “50 to 100″ etc). Data manipulation could also be about cleaning your data – e.g. reformatting postcodes to make them all the same (e.g. GL7 1UP vs GL 7 1UP) or correcting known data entry errors (e.g. S being incorrectly typed as 5).
What kinds of manipulations are you likely to need to do? Well, think about the kinds of data you have. There are 2 basic data types – String (character) variables and numeric variables (including dates) which break down into the following:
- Spatial – e.g. postcodes, regions, counties, grid references
- Ordinal – e.g. business turnover banded, value band, number of employees banded, scored data (e.g. satisfaction – very satisfied to very unsatisfied or agree strongly to disagree strongly)
- Categorical – e.g. gender, SIC, company name, account manager
- Interval – e.g. income, number of employees, turnover, distance travelled
If you have a lot of spatial data and want to look at things like sales territory planning, do you need a mapping tool to plot them on a map? You may need to be able to perform string manipulations e.g. to reformat postcodes to make them uniform or break them down into postcode area, district & sector.
Think about the need to sort data, merge data files, create new variables (e.g. creating a region variable based on postcode). Will you need to look at sums, means, medians etc of numeric variables?
Think about how you might need to use dates in your analysis. Most analysis tools (including Excel) actually store dates as a number – the date you see is just a format applied to the number to make it readable. This means that you are able to subtract dates from one another to determine how long ago something happened in days, months or years. The treatment of dates could be important when looking at compatibility between other tools you have e.g. SPSS stores dates as the number of seconds since 14/10/1582:00:00:00 whereas SAS stores dates as the number of days since 01/01/1960.
When it comes to data manipulation the possibilities are endless – usually only constrained by the tool’s functionality. Some tools have a point and click or drag and drop interface which you can use to build up formulae or syntax statements – this can be very helpful if you are new to writing syntax or programming but can be restrictive. Others have a programming or syntax writing interface where you have the freedom to write syntax to achieve your required result – this is a very powerful & flexible way to manipulate your data but can have a high training overhead!
In summary, when choosing an analytical tool, make sure you:
- Involve all stakeholders in your organisation right at the start
- Define your analysis requirements
- Understand the skill level involved to understand & work the tool
- Plan for any resource required & timescales (including training!)
- Understand how the tool will interface with existing systems, and whether it will crash the network
- Research research research!!!
- Evaluate at each license renewal
Our take on a few of the solutions out there.
| Package | Positive | Negative |
| Excel | Everyone has it, everyone uses it. Easy to use, and does have advanced facilities. It’s ability to interface with other packages and also to write Visual Basic applications makes it highly useful as a deliverable of analysis results, “what if” and scenario planning. We love it. | Restricted in functionality. Low record limit, and oddities in the way it treats data |
| SPSS | Powerful stats package. Menu system makes it more accessible to casual users. Can also programme using SPSS syntax for more sophisticated data manipulation. | Daunting for the occasional or non-statistically qualified. To get most of power – need to use syntax. |
| SAS | Very powerful statistics, data manipulation & data integration – many integrated products for everything from data warehousing to web analytics & reporting, campaign management & industry specific solutions. Also excellent for using multiple datasets. | Daunting. To get most of power – need to use programming. |
| Fast Counters | Can produce summary statistics very quickly. | Less powerful for modelling and advanced techniques. |
| SQL | Very good data manipulation & data warehousing – numerous analysis add-ons. Excellent for using multiple datasets. | Less useful for more advanced analysis. To get most out of it – programming is required. |
| Alterian | Campaign management, selection planning, reporting, analysis, targeting, e-marketing. | Can be restrictive as there is no function for sophisticated programming. Note that the tool works in a record-based way rather than rule-based, which has important implications on data change & import to the system |
| Business Objects | Data integration, reporting, web reporting, report distribution, analysis. | Can be restrictive as there is no function for sophisticated programming. |
| Bespoke | Good at solving specific problems. | Can be limited in functionality. |
There are other tools that perform more specific operations for example:
Web Analytics
- Intellitracker
- Webtrends
- Site Intelligence
- Speed-Trap
Specialised
- TN DM Calculator (useful for looking at mailing selections & campaign rollout)
- CHAID
Fast Counting
- FastStats Discoverer
- TN Profiler (useful for cross-tabulations and filters on large datasets. Low cost)
Geographic Mapping
- MapInfo
NB: There are more solutions out there than those listed. This is not a definitive list, but a few of the more common ones. You should contact the various suppliers to see how they can help you with your specific requirements and for cost estimates.
Some useful links
Web Support links
User Groups
- SAS Forum
- “Excel Forum
- MS Access Forum (also includes other Microsoft tools – e.g. SQL & Excel)
Syntax Libraries

Leave a Reply