

Showing posts with the label Data Profiling

Why should I Profile my Data?

Your organization has been running an enterprise system for a long time. Do you know how many data quality issues are hiding in your data? What does that mean?

1. You may have different date formats, such as mmddyyyy, ddmmyyyy, dd-mon-yyyy and so on.
2. You may have unit of measurement (UOM) inconsistencies: one person enters data in inches, another in cm and another in feet.
3. Have you considered currency conversion and exchange rate issues?
4. And how about the material code? One person codes it as ABC123, another as ABC-123 and another as ABC 123. All are the same material, but when you search for ABC123 you don't find it, and you order another 1000 units while the same material, named ABC 123, is sitting in your warehouse.
5. Do you know that the vendors ABC, AB-C, AB Corp and AB Ltd are the same, under one group?

All these issues are eating into your money, directly or indirectly. It happens right under your nose, because these are issues you go through every day but ignore due to invisibi...
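To make the material code problem concrete, here is a minimal sketch of one way to spot such variants. It assumes that codes differing only in case, spaces or punctuation refer to the same material, which may not hold in every system. The function names and the sample list are made up for illustration.

```python
import re
from collections import defaultdict


def normalize_code(code: str) -> str:
    """Uppercase the code and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", code.upper())


def find_duplicate_codes(codes):
    """Group codes by normalized form; keep groups with more than one spelling."""
    groups = defaultdict(set)
    for code in codes:
        groups[normalize_code(code)].add(code)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Hypothetical sample data, based on the codes mentioned in the post.
codes = ["ABC123", "ABC-123", "ABC 123", "XYZ900"]
duplicates = find_duplicate_codes(codes)
print(duplicates)
```

Each group in the result is only a candidate for review. Someone who knows the materials should confirm a match before any records are merged.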

Profile, Cleanse, Classify, Enrich your Data to get spend visibility

So yes, now you know that you need your data in the correct shape, but how do you get the whole juggernaut moving? I will tell you: this is a step-by-step process. First you need to identify your needs. How do you do that? Profile your data using simple queries on a database server such as MS SQL Server or Oracle. There are also sophisticated tools on the market that can profile data for you. What does profiling mean for you? It analyzes your data and tells you how much dirty or inconsistent data you have. You may find null values, date format issues, numeric and character data mixed in the same field, inconsistently coded two-valued fields such as male/female, and so on. Once you know how dirty your data is, at least you know where you stand. The next step is to analyze the results and estimate the cleansing effort. I will tell you about that soon. Watch this space.
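As a rough illustration of profiling with simple queries, here is a small sketch using Python's built-in sqlite3 module in place of MS SQL Server or Oracle. The table purchase_orders, its columns and its sample rows are invented for this example. It counts nulls per column and summarizes the "shape" of each value (letters become A, digits become 9), so that mixed date formats and text in numeric fields stand out.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchase_orders (material_code TEXT, order_date TEXT, quantity TEXT)"
)
conn.executemany(
    "INSERT INTO purchase_orders VALUES (?, ?, ?)",
    [
        ("ABC123", "12-Jan-2020", "100"),
        ("ABC-123", "01152020", "ten"),
        (None, "2020/01/15", "250"),
        ("XYZ900", None, None),
    ],
)


def null_counts(conn, table, columns):
    """Count NULL values per column. Table and column names must be trusted."""
    result = {}
    for col in columns:
        query = f"SELECT COUNT(*) FROM {table} WHERE {col} IS NULL"
        (count,) = conn.execute(query).fetchone()
        result[col] = count
    return result


def shape(value: str) -> str:
    """Replace every letter with 'A' and every digit with '9'."""
    letters_masked = re.sub(r"[A-Za-z]", "A", value)
    return re.sub(r"[0-9]", "9", letters_masked)


def pattern_counts(conn, table, column):
    """Count how often each value shape occurs in one column, ignoring NULLs."""
    rows = conn.execute(f"SELECT {column} FROM {table} WHERE {column} IS NOT NULL")
    return Counter(shape(value) for (value,) in rows)


nulls = null_counts(conn, "purchase_orders", ["material_code", "order_date", "quantity"])
date_patterns = pattern_counts(conn, "purchase_orders", "order_date")
qty_patterns = pattern_counts(conn, "purchase_orders", "quantity")
print(nulls)
print(date_patterns)
print(qty_patterns)
```

A column with several different shapes is a sign of inconsistent entry. Note that shapes alone cannot tell mmddyyyy from ddmmyyyy, since both come out as eight digits, so those values need a closer look.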