Approach to Start the Data Profiling

So you have large or medium-sized enterprise legacy systems, or maybe an ERP system, in your organization, and you have decided to profile the data you have. The first step is to decide which data you are going to work with. Normally, spend analysis is done on your procurement and material data, so the material master, vendor master, MRV, PO, and part master are the ideal candidates to start with.
While extracting the data you need to be very careful: if you miss critical fields that the analysis requires, you will find out only at a very late stage, and then everything starts again from scratch. The E in the ETL process is a large subject in itself, so we will not get into it right now.

Here I will assume that you are already there: you identified the fields correctly and went ahead with generating those text files, with proper delimiters. :-)
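Since a missing field is so expensive to discover late, a quick header check on each extract can help before any profiling starts. Below is a minimal sketch, assuming the extract is a pipe-delimited text file with a header row. The function name `missing_fields` and all field names in it are made up for illustration; they are not from any real system.

```python
import csv
import io


def missing_fields(extract, required, delimiter="|"):
    """Return the required field names that are absent from the header row."""
    reader = csv.reader(extract, delimiter=delimiter)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip().upper() for name in header}
    return [name for name in required if name.upper() not in present]


# Hypothetical material master extract; the field names are illustrative only.
sample = io.StringIO("MATERIAL_NO|DESCRIPTION|UOM\n1001|Hex bolt M8|EA\n")
print(missing_fields(sample, ["MATERIAL_NO", "DESCRIPTION", "MATERIAL_GROUP"]))
```

For a real extract you would pass an open file object instead of the in-memory sample, and the required list would come from whatever your analysis needs.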

Now you are ready to run the profiling. Here we have two options: a semi-automatic, database-driven approach and a fully automatic, tool-dependent approach. Both are good and have their own advantages. Let us look at each:

1. Database-driven approach: Here you load all the extracted text files into tables in a database server, say MS SQL Server. You write some routines that tell you about the hits and misses. If you know the data structure and the underlying data well and, most importantly, know how to write good generic queries, then this option is for you. I will talk about this in my next write-up.
2. Automatic tool-dependent approach: There are several tools available on the market that do just this kind of work for you. Choose one of them and you are done. I know Exeros X-profiler.
The tool lets you create a project, link a text file or any major type of database as a data source, and gives you a profiling report. It fulfills most of the requirements that we have, so it is good.
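To give a feel for the database-driven option, here is a rough sketch of one generic routine. It makes several assumptions: it uses Python's built-in `sqlite3` as a stand-in for MS SQL Server, it reads "hit" as a populated value and "miss" as a blank or NULL value, and the table `vendor_master`, its columns, and the function `profile_table` are all hypothetical. Treat it as an illustration of the idea, not as the routine itself.

```python
import sqlite3


def profile_table(conn, table):
    """Return hit (filled), miss (blank or NULL) and distinct counts per column."""
    # PRAGMA table_info is SQLite-specific; row[1] holds the column name.
    columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
    report = {}
    for col in columns:
        # The same generic query is run for every column of the table.
        query = f"""
            SELECT
                SUM(CASE WHEN TRIM(COALESCE("{col}", '')) <> '' THEN 1 ELSE 0 END),
                SUM(CASE WHEN TRIM(COALESCE("{col}", '')) = '' THEN 1 ELSE 0 END),
                COUNT(DISTINCT NULLIF(TRIM("{col}"), ''))
            FROM "{table}"
        """
        hit, miss, distinct = conn.execute(query).fetchone()
        # SUM returns NULL on an empty table, so fall back to zero.
        report[col] = {"hit": hit or 0, "miss": miss or 0, "distinct": distinct}
    return report


# Hypothetical vendor master rows, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendor_master (vendor_id TEXT, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO vendor_master VALUES (?, ?, ?)",
    [("V001", "Acme Supplies", "US"), ("V002", "Acme Supplies", ""), ("V003", "Globex", None)],
)
for column, stats in profile_table(conn, "vendor_master").items():
    print(column, stats)
```

Because the query is built from the column list, the same routine works on any loaded extract without being rewritten per table, which is the point of writing generic queries. On SQL Server the column list would come from the system catalog rather than the SQLite pragma.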
Look for my next post, which compares the manual approach, the profiler tools, and the rest.
