
Approach to Start the Data Profiling

So you have a large or medium-sized enterprise legacy system, or maybe an ERP system, in your organization, and you have decided to profile the data you have. The first step is to decide which data you are going to work with. Spend analysis is normally done on your procurement and material data, so the material master, vendor master, MRV, PO and part master are the ideal candidates to start with.
You need to be very careful while extracting the data. If you miss critical fields that the analysis requires, you will find out only at a very late stage, and then everything starts again from scratch. The E in the ETL process is a large subject in itself, so we will not get into that right now.
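One cheap safeguard against a missed field is to check each extract's header against the list of fields the analysis needs before any profiling starts. The snippet below is only a minimal sketch: the function name `missing_fields`, the pipe delimiter and the vendor master field names are all hypothetical, chosen for illustration.

```python
import csv
import io

def missing_fields(header_line, required, delimiter="|"):
    """Return the required field names that are absent from the extract header."""
    header = next(csv.reader(io.StringIO(header_line), delimiter=delimiter))
    present = {name.strip().upper() for name in header}
    return [name for name in required if name.upper() not in present]

# Hypothetical field list for a vendor master extract (an assumption, not from any real system)
required = ["VENDOR_ID", "VENDOR_NAME", "COUNTRY", "PAYMENT_TERMS"]

# Hypothetical first line of the extracted text file
header_line = "VENDOR_ID|VENDOR_NAME|CITY|COUNTRY"

missing = missing_fields(header_line, required)
print(missing)
```

Running a check like this right after extraction means a gap shows up the same day, not weeks into the analysis.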

Here I will assume that you are there: you identified the fields correctly and went ahead and generated the text files, with proper delimiters. :-)

Now you are ready to run the profiling. We have two options: a semi-automatic, database-driven approach and a fully automatic, tool-dependent approach. Both are good and have their own advantages. Let's see:

1. Database-driven approach: Here you load all the extracted text files into tables in a database server, say MS SQL Server. You write some routines which will tell you about misses and hits. If you know the data structure and the underlying data well and, most importantly, how to write good generic queries, then this option is for you. I will talk about this in my next write-up.
2. Automatic tool-dependent approach: There are several tools available in the market just to do this kind of work for you. Choose one of them and you are done. The one I know is Exeros X-profiler.
The tool lets you create a project, link a text file or any major type of database as a data source, and it gives you a profiling report. It fulfills most of the requirements we have, so it's good.
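To give a feel for the database-driven option, here is a rough sketch of a generic profiling routine. It makes several assumptions: SQLite (via Python's built-in `sqlite3` module) stands in for MS SQL Server so that the example is self-contained, "hit" and "miss" are read as filled versus empty values in a column, and the table name, column names and sample rows are invented for illustration.

```python
import csv
import io
import sqlite3

def load_extract(conn, table, text, delimiter="|"):
    """Load a delimited extract (first row = header) into a table of text columns."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}" TEXT' for c in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    marks = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)
    return header

def profile(conn, table, columns):
    """Run the same generic query per column: filled (hit), empty (miss), distinct."""
    report = {}
    for col in columns:
        hit, miss, distinct = conn.execute(
            f'''SELECT
                    SUM(CASE WHEN TRIM("{col}") <> '' THEN 1 ELSE 0 END),
                    SUM(CASE WHEN "{col}" IS NULL OR TRIM("{col}") = '' THEN 1 ELSE 0 END),
                    COUNT(DISTINCT NULLIF(TRIM("{col}"), ''))
                FROM "{table}"'''
        ).fetchone()
        report[col] = {"hit": hit, "miss": miss, "distinct": distinct}
    return report

# Invented sample rows standing in for a material master extract
sample = """MATERIAL_ID|DESCRIPTION|UOM
1001|Hex bolt M8|EA
1002||EA
1003|Hex bolt M8|
1004|Bearing 6204|KG
"""

conn = sqlite3.connect(":memory:")
columns = load_extract(conn, "material_master", sample)
report = profile(conn, "material_master", columns)
for col, stats in report.items():
    print(col, stats)
```

The point of keeping the query generic is that the same routine runs unchanged over the vendor master, PO or any other extract. On MS SQL Server the SQL would need small dialect adjustments, and the column names here come from a trusted header, so they are quoted but not otherwise sanitized.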
Look for my next post comparing these things: the manual approach, profiler tools and all.
