Microsoft CRM V3 was designed with performance in mind. We dedicated significant resources and schedule time to ensure that we tackled performance issues early and reliably. We laid out a performance plan during the engineering cycle, followed it closely, reviewed it frequently, revised it when necessary, and adhered to it as part of our overall release criteria. In this post, I will highlight some of the key aspects of this plan.
Performance Plan
With the previous version of our product, we had a few known areas for improvement, so our performance plan required dedicated attention to those areas. That alone wasn't sufficient, however: we needed to ensure that we could reliably measure key performance indicators and that we didn't falter as the product went through the engineering cycle. We also wanted to make sure that our partners and customers could benefit from our investments and plan their deployments with reliable performance characteristics.
With these broad goals in mind, we incorporated the following elements into our overall performance plan.
- Depth Test – Address known problem areas
- Breadth Test – Establish a repeatable performance benchmark
- Stress Test – Identify and address product limits under stress
- Performance Toolkit – Leverage in-house work to help our partners & customers plan their deployments
Depth tests were targeted projects that addressed well-established product areas where performance was suspect and where key team members (program manager, developers, and testers) had ongoing, targeted investigations. A key element of the revised performance plan was the breadth element: a CRM Performance Benchmark that exercised product features against a well-known configuration and produced results within an acceptable degree of variance each time. The benchmark would allow us to measure product performance repeatedly, build over build, and to ensure that performance did not worsen with any code change.
The benchmark would also allow us to put the product under severe conditions (low memory, high transaction rates, large databases, low network bandwidth, and so on) and identify its limits.
We wanted to make sure that our performance investments were not ad hoc but strategic: work that we could reuse ourselves in future releases and share with our partners and customers so that they could plan their deployments with predictable and reliable performance characteristics.
The Benchmark
We set out to build a performance benchmark that reflected not only our product features but also how we envision our customers using the product on a regular basis. We engaged a third party with expertise in performance benchmarking to help us establish such a benchmark. The end result was a measurement model that included the following elements.
- Real world scenarios
- Organization profiles for test data-set and transaction rates
- Target response times and acceptance criteria
To model real-world usage, we profiled our users into various personas: Business Executives such as a VP of Sales or Marketing, who are primarily interested in reports; Sales, Marketing, and Support Professionals, who work with Accounts and Contacts on day-to-day activities; Receptionists, who primarily deal with appointments; and Administrators, who update settings, customize the product, and so on. To identify scenarios, we incorporated third-party research data (gathered partly from our existing customers and partly from industry research), and we engaged the engineering team for each product feature area to estimate usage of their newly added features and to validate the research data. As a result, we had a collection of scenarios with usage statistics across all personas.
A scenario consisted of an end-to-end operation such as creating an opportunity: navigating to leads, opening a lead, updating the lead, converting it to an opportunity, updating the opportunity, attaching a note to it, and saving it. Each discrete operation was implemented as an individual web test. Scenarios therefore consisted of one or more web tests, and they shared common web tests such as navigating to a page or looking up a particular record, as in the sketch below.
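As an illustration of that composition, here is a minimal sketch in Python. It is purely hypothetical: the scenario and step names are invented, and the real benchmark implemented these steps as web tests rather than Python functions.

```python
# Hypothetical sketch (not the actual toolkit code): scenarios modeled as
# ordered sequences of discrete web tests, with common steps shared across them.

# Shared web tests reused by many scenarios (names are made up).
COMMON_STEPS = ["navigate_to_grid", "lookup_record", "open_record"]

# Each scenario is an end-to-end operation built from discrete web tests.
SCENARIOS = {
    "ConvertLeadToOpportunity": [
        "navigate_to_grid",        # navigate to leads
        "open_record",             # open a lead
        "update_record",           # update the lead
        "convert_to_opportunity",  # convert it to an opportunity
        "update_record",           # update the opportunity
        "attach_note",
        "save_record",
    ],
    "CreateEmail": ["navigate_to_grid", "open_record", "create_email", "save_record"],
}

def web_tests_for(scenario_name: str) -> list[str]:
    """Return the ordered web tests that make up a scenario."""
    return SCENARIOS[scenario_name]
```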
We profiled our organizations and broadly classified them into three categories: small business customers, who purchase our Small Business Server Edition and have a very small concurrent user base (10 or fewer); medium-size businesses, with roughly 100 concurrent users; and departments in large organizations, with roughly 1,000 concurrent users. We specified a distinct deployment topology for each organization profile (a single Small Business Server for the first; a dedicated CRM Server and a dedicated SQL Server for the second; and a high-end SQL Server, 64-bit edition, for the third). We also built a data set for our scenarios with realistic depth. For example, an Account had to have a few Contacts, a few activities, notes, and an account hierarchy (sub-accounts, parent accounts, and so on). Our performance test team built data population tools that could generate both random data samples and that depth, based on specifications in an XML file (more details later, in the "How you benefit" section); a rough sketch of the idea follows.
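To show the shape of spec-driven data population, here is a small sketch. The XML element names, attributes, and counts are invented for illustration and do not reflect the toolkit's actual schema.

```python
# Hypothetical sketch of spec-driven data population; the real toolkit's XML
# schema is not shown in the post, so every name and number here is made up.
import random
import xml.etree.ElementTree as ET

SPEC = """
<dataset>
  <entity name="account" count="1000">
    <child name="contact"  min="2" max="5"/>
    <child name="activity" min="3" max="8"/>
    <child name="note"     min="1" max="3"/>
  </entity>
</dataset>
"""

def generate(spec_xml: str) -> list[dict]:
    """Generate parent records with a random number of child records per the spec."""
    records = []
    for entity in ET.fromstring(spec_xml).iter("entity"):
        for i in range(int(entity.get("count"))):
            record = {"type": entity.get("name"), "id": i, "children": {}}
            for child in entity.iter("child"):
                lo, hi = int(child.get("min")), int(child.get("max"))
                record["children"][child.get("name")] = random.randint(lo, hi)
            records.append(record)
    return records
```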
We set target response times for the scenarios (that is, for each individual web test within a scenario), mindful of the organization profiles, their transaction rates, and the hardware topology. Because the tests ran from build to build and the observed response times were not expected to be identical each time (network, I/O, and the number of times each web test ran all contributed to variance), we specified acceptance bands that allowed a small degree of variance, as below; a small sketch of the banding logic follows the list.
- Green – 10% variance or 1 second absolute value
- Yellow – 50% variance or 5 second absolute value
- Red – anything worse than yellow
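As a minimal sketch of how such banding can be applied, assuming times are in milliseconds and that the variance is measured as the amount by which an observed time exceeds its goal (the exact rule we used internally may differ):

```python
# Hedged sketch of the acceptance-band logic as described above: Green if the
# observed time exceeds its goal by no more than 10% or 1 second, Yellow if by
# no more than 50% or 5 seconds, Red otherwise. Times assumed in milliseconds.

def band(goal_ms: float, observed_ms: float) -> str:
    over = observed_ms - goal_ms           # how far past the target we are
    if over <= max(0.10 * goal_ms, 1000):  # within 10% or 1 s of the goal
        return "Green"
    if over <= max(0.50 * goal_ms, 5000):  # within 50% or 5 s of the goal
        return "Yellow"
    return "Red"

# Illustrative values only:
print(band(2000, 2500))  # Green
print(band(2000, 5500))  # Yellow
print(band(2000, 9000))  # Red
```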
With all the necessary elements in place, we had a credible benchmark; all that remained was to implement it and put it into regular use.
Putting it all together
We experimented with the performance and load simulation tools we could use. An important aspect of our overall performance plan was to allow our partners and customers to reuse our work for their own configurations, so an expensive off-the-shelf performance tool was ruled out. We looked in-house for available tools and found the Visual Studio Performance Tool to be a good starting point. Our test team had already built great data generation tools (the tooling also accommodated a time element for appointments, so that service scheduling scenarios had resources with pre-existing appointments in their calendars, far more of them in the near future than in the distant future, much like our real-life work calendars; a sketch of that skew appears below). The test team then diligently implemented all the web tests to mimic our scenarios.

We brought scenarios online in groups: first the simple web tests, then Outlook Client synchronization (simulated at the server), then reporting, service scheduling, marketing automation, bulk imports, workflow, and so on. Little by little, the entire benchmark came online. We certainly encountered issues. Each time we hit a snag, we engaged the corresponding feature team closely. The Reporting, Service Scheduling, Marketing Automation, and Workflow feature teams all had their fair share of woes when we brought these elements together, and much to their credit, they identified and resolved issues promptly, and across feature boundaries, to keep the benchmark up and running.
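The calendar skew mentioned above can be pictured with a small sketch; the exponential falloff and the parameter values are assumptions for illustration, not the distribution the actual data generation tool used.

```python
# Hypothetical sketch of generating appointment start times that cluster in the
# near future and thin out further out; the real tool's distribution is unknown.
import random
from datetime import datetime, timedelta

def future_appointments(count: int, horizon_days: int = 90, mean_days: float = 14.0):
    """Return appointment start times clustered toward the near future."""
    now = datetime.now()
    starts = []
    while len(starts) < count:
        offset = random.expovariate(1.0 / mean_days)  # mean ~14 days out (assumed)
        if offset <= horizon_days:                    # keep within the horizon
            starts.append(now + timedelta(days=offset))
    return sorted(starts)

# Example: 20 appointments for one resource, mostly within the next few weeks.
for start in future_appointments(20)[:5]:
    print(start.strftime("%Y-%m-%d %H:%M"))
```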
We published results on a weekly basis and got the team excited about making the lights go green. Below is an excerpt of the actual report we published before the release of the product; we ran over 100 tests, with most passing in the green band.
| Scenario (Test Script) | Goal | Build 5289 | Build 5294 | Build 5300 |
|---|---|---|---|---|
| AccountActivityRollup | 2000 | 897 | 3144 | 2602 |
| … | … | … | … | … |
| CreateEmail | 10000 | 2677 | 8862 | 7492 |
| CreateList | 3000 | 228 | 377 | 408 |
| DeleteContact | 50000 | 1117 | 28313 | 7430 |
| … | … | … | … | … |
| DisplayAppointments | 20000 | 9871 | 14301 | 17727 |
| … | … | … | … | … |
| SalesHistoryReport | 30000 | 0 | 29309 | 11192 |
| SalesPipelineReport | 30000 | 2609 | 21235 | 8201 |
| … | … | … | … | … |
| SearchAvailability | 10000 | 3352 | 0 | 12299 |
| … | … | … | … | … |
| SyncToOutlook:DoPrepareSync | 30000 | 11977 | 24837 | 22434 |
| SyncToOutlook:SyncItemByTypeForContact | 10000 | 712 | 728 | 900 |
| SyncToOutlook:SyncItemByTypeForTask | 10000 | 189801 | 4486 | 5679 |
How you benefit
We have already released the performance toolkit. You can download it and start customizing it for your topology and projected usage. For example, you may choose a lighter or heavier data set than we used in our benchmark, or alter the transaction rates and frequencies of the different web tests. This lets you model the environment to your needs. In our case we chose to model Sales, Service, and Marketing activities evenly, but your organization may be sales- or service-centric, and you may choose transaction rates accordingly (see the sketch after this paragraph). With the appropriate scenarios, their frequencies, and the dataset in hand, you can then determine your target hardware and run the tests to see the response times. With the current generation of hardware, you can enable or disable processors to see whether a dual-processor machine would suffice or whether you would rather have a quad-processor machine; you can do the same to identify the optimal memory size.
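As a sketch of that kind of tailoring, the snippet below re-weights a hypothetical Sales/Service/Marketing mix and derives per-group transaction rates. The user counts, rates, and percentages are made-up examples, not values from the toolkit.

```python
# Hedged sketch of tailoring the workload mix: the toolkit's actual
# configuration format isn't shown in the post, so this only illustrates the
# arithmetic of re-weighting scenario groups and deriving hourly rates.

CONCURRENT_USERS = 100            # e.g., the medium-size organization profile
TRANSACTIONS_PER_USER_HOUR = 6    # assumed overall transaction rate per user

# Even mix (as in our benchmark) versus a sales-centric organization (assumed shares).
EVEN_MIX = {"Sales": 1 / 3, "Service": 1 / 3, "Marketing": 1 / 3}
SALES_CENTRIC = {"Sales": 0.6, "Service": 0.25, "Marketing": 0.15}

def hourly_rates(mix: dict[str, float]) -> dict[str, float]:
    """Transactions per hour to drive for each scenario group."""
    total = CONCURRENT_USERS * TRANSACTIONS_PER_USER_HOUR
    return {group: round(total * share, 1) for group, share in mix.items()}

print(hourly_rates(EVEN_MIX))       # {'Sales': 200.0, 'Service': 200.0, 'Marketing': 200.0}
print(hourly_rates(SALES_CENTRIC))  # {'Sales': 360.0, 'Service': 150.0, 'Marketing': 90.0}
```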
Carrying over the work into future releases
We started the performance work as a planned milestone in V3 and went through some early fits and starts; it took us a while to get all the elements in place: the plan itself, the scenarios, the tools, the reporting, and so on. As we start building the next generation of our product, we have incorporated this process into our engineering milestones. With V3 we have laid a good foundation that we continue to use and refine. We are improving our benchmark to incorporate new scenarios and to identify new topologies and workloads, and we are improving our tools. Our product team is excited and fully on board with getting us all-green builds.