This post originated with a conversation I was having with a potential customer who was considering buying Dynamics CRM. He had been prompted by his business partner to ask about integration with a well-known accounting application. Normally, an early question in such a conversation is ‘what data integration do you want, and why ?’, and this question was indeed asked; what was unusual was that it was the potential customer who asked me, rather than the other way around.
I’m not going to be able to answer this question to my satisfaction in one post (I may well come back to it in future posts), but what I will do in this post is introduce the main potential benefits of data integration, and explain what is typically the lowest cost approach to achieve some of those benefits. Be aware that this post is mostly focused on design considerations, and is intended to give a general view on what can be done, rather than working as a specific ‘How To’ set of instructions.
Why is there a need for data integration ?
A starting point is having 2 or more systems with structured data; the different systems normally having differing scope and purpose. A common, and relatively simple, example of 2 systems is Dynamics CRM and a finance application. If one system could do everything well, then we wouldn’t need any data integration, but this is rare. In this case, Dynamics CRM can handle some finance functionality (e.g. simple invoicing), but not others (e.g. cash flow). Similarly, most finance applications have some CRM-like functionality (e.g. manage customer contact details), but not others (e.g. interaction with customers before they buy anything). So, we have 2 systems with different uses, but there is overlap between them; for example they both store data about customers, and they may well also both store product information.
Benefits of data integration
Our two systems could run completely independently of each other, but there are several potential benefits to integrating the systems in some way. For example:
1. Having a centralised view of related data which rightfully originates from different systems. An example would be an account overview that showed data that originates in CRM (current opportunities, recent cases), as well as data that from the finance application (recent payment history, credit rating)
2. Reducing manual effort (and risk of errors) from entering the same (or very similar) data into more than one system. If a Dynamics CRM user has entered information about a sales order and the products on that order, then should you require a finance user to re-enter that data into the finance application to raise an invoice?
3. Maintaining consistency of duplicated data. Here I’m referring to less volatile data, such as customer details. You would expect each system to store a customer’s address, but how do we ensure the address details are the same?
This post will concentrate on the first benefit, and how it can be achieved using a reporting approach, but first let’s consider the range of options for achieving this benefit.
Data integration options for a centralised view
There are 3 broad approaches to get a centralised view of data:
1. Leave the data in the source systems, and use a reporting mechanism such as a Reporting Services report to combine the data from these systems
2. Implement a process to copy data from one or more systems into another system, giving one system with all the data you want. For example, you could bring data from the finance application into the CRM database. Dynamics CRM has some useful attributes on the account entity (e.g. creditonhold), and you could create additional custom entities and attributes to store data such as the payment history
3. Build a data warehouse with data that is imported from each source system. I’ve included this option for completeness, but a data warehousing project is very different from a data integration project, and I won’t cover it further here.
Comparing the first two approaches above, the main advantages of the reporting option are:
- It is invariably the easier (and hence cheaper) option to implement. The main technical work to do is to manage identifiers (see below), and understand the data structures in the different systems, and this work also has to be done to copy data. After that, designing a report is normally significantly less effort than building a robust process to copy data
- The data displayed is always up to date, as it comes directly from the source system. Most data duplication processes have some latency within them, which may or may not be significant
Whereas the main advantages of copying data are:
- If you copy data into another system, you have more options for searching and displaying data in that system. For example, if data in the creditonhold attribute of the CRM account record was populated you could run an Advanced Find in CRM to find all accounts on hold, but if that data were only in the finance application you don’t get the same flexibility.
- This option makes it possible to edit, as well as view the data outside of the source system. Although this is a potential benefit to the users, it raises a lot of implications around synchronising changes and security. Allowing data editing dramatically increases the complexity, and hence cost, of the data integration work
Other factors can have an impact, but could work either way:
- Security. I would generally prefer to manage security solely in each source system, which lends itself more to the reporting option. However, this would require users to have access to each source system, which may be a problem
- Licensing. The reporting option may require more user licenses for one or more systems, but the data copying option may introduce the need for an External Connector license or equivalent
- Technology to access source systems. The reporting approach is based on being able to access each system as a relational database for SQL queries, which may not always be possible
In most cases, the fact that reporting across systems is easier and cheaper is the main influence on the decision, and the rest of this post will look at how this can be achieved, starting with identifiers.
As mentioned earlier, a significant part of the main technical work for any data integration approach is the management of identifiers. It must be possible to reliably relate a record in one system to the corresponding record in another system. Each system will have one or more attributes that identify a record; for any data integration to be robust we must use identifiers that are unique within their own system, and will not change. This rules out attributes that may change, like the account name, for example.
Two common options for identifiers are:
1. Use the built-in ones of each system (every system must have a viable identifier), and store the identifier from one system in an attribute in the corresponding record in the other system. For example, when integrating Dynamics CRM and GP, we normally store the account identifier from GP (an identifier in the form A001) in a custom attribute in the CRM account record (see below for how we get it there)
2. Generate your own identifiers, such as an account number, or product code, and store them in each system (almost all relevant systems would have an appropriate attribute already for these, but you could use a custom attribute if you prefer). These identifiers could be generated within one of you systems, or outside of them. What matters is they have to be unique within each system
Once you’ve decided on your identifiers, you have to ensure they are correctly entered into each system. This should be done as part of any initial data migration process, but the main consideration relates to subsequently created records. It is important that the business process for creating new records is agreed, and adhered to; it is much easier to manage the identifiers for accounts, for instance, if you know that a new account is always created in CRM first, then in the finance package, compared to a scenario where an account could be created in the finance package, and you could not be certain if it already existed in CRM or not. Once the business process is agreed, then we can implement a mechanism to generate (if required) and store the identifiers.
Although the management of identifiers is fundamental to effective integration, it is questionable whether it is worth creating an automated process, or whether to retain a manual element to it. The reason it is questionable is that the process should only be relevant on entity creation (and deletion, if permitted), and it may be that certain entities are so rarely created that it’s not worth building an automated system. For example, if you only create a new account once a month on average then a manual approach would seem appropriate, but it wouldn’t if you had 1000 new accounts a month.
Here are a couple of examples of how we’ve managed the account identifiers between Dynamics CRM and GP for different customers with different business processes:
1. An automated system. Accounts are always created first in CRM, and never manually created in GP. To create the account record in GP, the user clicks a button on the CRM account form. This is a ISV button that calls integration code to generate a new account record in GP, reads the generated account identifier for the GP record, and stores it in a custom attribute on the CRM account entity
2. A semi-automated system. Accounts are manually created in CRM, and also manually created in GP. To link them, the user also clicks a button on the CRM account form. This is another ISV button, but this code reads account information from GP and displays to the user a list of GP accounts that are not related to CRM accounts. If the user selects an account from the list, then the GP identifier for the account is stored in a custom attribute on the CRM account entity
These are not the only options, but they were appropriate based on the business processes (in example 1 we could dictate the process, whereas in example 2 we had to adapt to existing processes) and frequency of account creation (which was relatively rare in example 2).
Before I move on, I’ll raise a few other related points that I don’t have space to cover in detail:
- In my examples, it is perfectly reasonable for account records to exist in CRM, but not in the finance package, and this reflects most business processes. In general a CRM application will contain account records for potential customers, whereas a finance application normally only needs to store data about customers who have bought something. This is why neither example ran when an account was created in CRM
- An entity like the product entity may be handled differently. A common approach would be to use a batch process to upload a new product catalog, complete with externally generated identifiers, rather than the ad-hoc record creation described above
Creating and displaying reports
Once the identifiers have been established, creating a common view is relatively straightforward, though again there are a couple of options:
1. Display data from one system within another system. A classic example here would be to create a reporting services report that displays data about an account from the finance system, and takes the agreed identifier in the finance system as a parameter. This report could then be displayed in an IFrame in the account form within Dynamics CRM (see http://blogs.msdn.com/crm/comments/5275605.aspx for an example of the general techniques involved in displaying reports in an IFrame in CRM)
2. Develop a report that displays data from multiple source systems. If using reporting services, the simplest option is to create a report with 2 or more data sources, one per source system. This works fine if different report elements (e.g. tables, charts) are to display data from different systems, which is most commonly the case. It is also possible to combine data from different source systems within one report element (e.g. a chart that displays amounts invoiced (from the finance system) along with projected income (from opportunities in CRM)); you can do this using the Linked Server functionality provided by SQL Server (see http://msdn.microsoft.com/en-us/library/ms188279.aspx ).
This post has covered the major considerations involved in building an effective data integration between CRM and other systems using a reporting approach, which is typically the most cost-effective approach. Most of the implementation information in this post relates to managing identifies; this was deliberate as this is the key part of any data integration.
I also covered other potential data integration benefits and approaches, and may come back to these in future posts. The information about identifiers will be equally relevant to these other approaches.