During my long stint as a business intelligence professional, I have seen many projects fail and, of course, smelled some success. Let’s see what the successful projects have in common, and what the unsuccessful ones share.
Organisation Structure: Unlike other IT engineering projects, Business Intelligence projects need a strong business interface, and it is make or break for a BI project. Business is divided into groups, or sliced by business function or business unit (BU), but data is not. A typical BI project will run across multiple business functions, which means working with two or more VPs and their organisations. If the BI project is sponsored by IT, it will hit a bottleneck with Business. The typical replies will be: a.) we already have this information in XLS, b.) our processes are different, c.) we cannot wait so long for data, d.) this is done at the vendor site.
And Business is right. Every business function or unit is different: granularity is different, business focus is different, workflow is different. IT-sponsored BI projects treat all business units the same, and hence lose relevance for the business. I have seen a high rate of success with MIS-related BI projects, because there IT is both consumer and sponsor of the project. Otherwise, IT does not know the business rules and processes needed to make a report relevant; that has to come from Business. So it is important that Business sponsors the project, specifically the business unit heads or VPs.
Enterprise Data Warehouse (EDW): A typical approach to a BI project is to create an enterprise data warehouse from which every BU (business unit) takes data to build its own data mart. Business is complex, and putting all the rules into an enterprise warehouse carries a very high risk of failure. The enterprise data warehouse becomes too cumbersome to use, and invariably the business rules will change over time. The cost of changing ETL jobs is very high, and it takes a long time to propagate a change from source, to staging area, to target, to reports, to analytics. Business gets frustrated, figures out a way to get the data into an XLS sheet and does its own analysis. The EDW fails.
A leaner and less complex data model is better. Report developers, ETL developers and data integration developers are not business analysts. They are bound to make mistakes when putting all the rules into an all-encompassing warehouse. A good share of the business rules can be pushed to the report engine and the analytics engine. A focused warehouse is more successful than a multi-purpose or generic warehouse.
Data Quality: In real business, dirty or incomplete data does come in. If it is fed into the warehouse as is, the report is faulty, and unfortunately that happens most of the time. Technical people can understand data types, data structures and raw data quality, such as null values, duplicate values, negative values and outliers. But they do not understand the business implications, and do not know what the right value should be. A typical ETL tool will discard those records and the report will not be updated. Let’s take an imaginary business rule: if an existing customer (from another business unit) walks in, you do not ask him to fill in the personal information form and only take his customer id. (The idea is that you will collect the rest of the data from the other business unit and save the customer's time.) When the desk operator feeds the data in, he or she enters only the customer id. If your business processes are not real time (in most cases they are not), only the customer id goes into the CRM system, and most probably the nightly ETL load will ignore the record as dirty data and your report will show a count that is one short.
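To make the walk-in example concrete, here is a minimal sketch of a load step that sets incomplete records aside for review instead of silently dropping them, so the walk-in count stays right. The `WalkIn` record, the `REQUIRED` field list and the `split_load` function are hypothetical names made up for illustration; they are not taken from any particular ETL tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WalkIn:
    """Hypothetical CRM record for a customer walk-in."""
    customer_id: str
    name: Optional[str] = None
    address: Optional[str] = None


# Assumed list of fields the nightly load normally insists on.
REQUIRED = ("name", "address")


def split_load(records):
    """Separate complete records from incomplete ones instead of discarding the latter."""
    clean, quarantined = [], []
    for rec in records:
        missing = [field for field in REQUIRED if getattr(rec, field) is None]
        if missing:
            # Keep the record together with the reason, so Business can decide what to do.
            quarantined.append((rec, missing))
        else:
            clean.append(rec)
    return clean, quarantined


records = [
    WalkIn("C001", "Asha Rao", "12 Lake Road"),
    WalkIn("C002"),  # existing customer from another unit: only the id was captured
]
clean, quarantined = split_load(records)

# The report counts every walk-in, including the ones waiting for enrichment.
walk_in_count = len(clean) + len(quarantined)
```

Whether a quarantined record should be counted, enriched from the other business unit, or rejected is exactly the kind of rule that has to come from Business, not from the ETL developer.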
The data analyst and the business analyst have to sit together, profile their data and look into such conditions, using tools like osDQ (http://sourceforge.net/projects/dataquality/), to get a good understanding of the data before moving on with the project.
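As a rough idea of what such a profile contains, the sketch below counts nulls, duplicates, negative values and outliers in a single column using only the Python standard library. It is not osDQ's interface; the `profile` function, the sample `order_amounts` column and the 1.5 x interquartile-range outlier rule are assumptions chosen for illustration.

```python
from collections import Counter
from statistics import quantiles


def profile(values):
    """Summarise the raw quality of one numeric column."""
    present = [v for v in values if v is not None]
    counts = Counter(present)

    # Assumed outlier rule: anything beyond 1.5 x IQR from the quartiles.
    q1, _, q3 = quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread

    return {
        "rows": len(values),
        "nulls": len(values) - len(present),
        "duplicates": sum(c - 1 for c in counts.values() if c > 1),
        "negatives": sum(1 for v in present if v < 0),
        "outliers": [v for v in present if v < low or v > high],
    }


# Made-up sample column, for illustration only.
order_amounts = [120, 135, None, 128, 135, -40, 131, 9000, 126]
report = profile(order_amounts)
```

The numbers only flag candidates. Whether a negative amount is a refund or a typing error is a question for the business analyst.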
Role of Data Architect: The data architect is probably the first person to know whether the project is on track, but he has limited visibility. Most of the time, data architects belong to the IT group and have very little say in the business units. Unfortunately, today we do not have a process or framework which tells the data architect how to talk to Business or show artifacts to it. TOGAF has tried to provide some framework, but it is very limited.
A good start is the IT domain architecture, where business units and high-level functionality are mapped. Let us take the example of a company which creates, tests, scores and reports K-12 tests.
Once the business domains are identified, the data architect should create a DFD (data flow diagram, like the image below) which shows which data moves across business domains and which data stays within a domain. The data flowing across business domains (or units) is the data most prone to error, as its values change from one domain to the next.
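The same idea can be written down as data before anyone draws the picture. The sketch below lists flows for the K-12 testing example and picks out the ones that cross a domain boundary. The domain names, the data items and the `cross_domain` helper are hypothetical; they are not read from the diagram.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """One arrow in the data flow diagram."""
    data: str
    source: str
    target: str


# Hypothetical flows for a company that creates, tests, scores and reports K-12 tests.
flows = [
    Flow("test items", "Test Creation", "Test Creation"),
    Flow("test forms", "Test Creation", "Testing"),
    Flow("student responses", "Testing", "Scoring"),
    Flow("raw scores", "Scoring", "Scoring"),
    Flow("scaled scores", "Scoring", "Reporting"),
]


def cross_domain(flows):
    """Return the flows whose data leaves one business domain and enters another."""
    return [f for f in flows if f.source != f.target]


# These are the hand-offs to review first with the business units involved.
risky = [f.data for f in cross_domain(flows)]
```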
Once the DFD is created, an entity reference model can be created as below, in which a steward can be identified for each entity.
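A minimal sketch of such a model, assuming each entity records the domain that owns it and, once agreed, a steward from the business side. The entity names, roles and the `without_steward` helper are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """One entry in a hypothetical entity reference model."""
    name: str
    domain: str
    steward: Optional[str] = None  # business-side owner, filled in once agreed


# Made-up entries for the K-12 testing example.
entities = [
    Entity("Test Form", "Test Creation", steward="Head of Content"),
    Entity("Student Response", "Testing"),
    Entity("Scaled Score", "Scoring", steward="Head of Psychometrics"),
]


def without_steward(entities):
    """List the entities that still have no business steward."""
    return [e.name for e in entities if e.steward is None]


open_items = without_steward(entities)
```

The list of entities without a steward is a useful artifact for the data architect to take to the business unit heads.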
Once we have a formal steward in place from the business side, the success rate increases. In my next post I will write in detail about the roles and responsibilities of the data architect and the process for finding a steward.
Good luck!