Expectations of Service Level Expectations
Recently, ICANN staff has been working to meet the community requirements related to establishing metrics for the performance of the "Post Transition IANA" as it processes root zone management requests. For those not following the development of the Service Level Expectations (SLEs), the agreed-upon principles for the monitoring, reporting, and assessment of these requests were:
- Attributable measures. Where practical, individual metrics should be reported attributing time taken to the party responsible. For example, time spent by IANA staff processing a change request should be accounted for distinctly from time spent waiting for customer action during a change request.
- Overall times. Notwithstanding the previous principle, there is value in overall metrics being reported to identify general trends associated with end-to-end processing times.
- Relevance. Metrics collected to support general analysis should be distinguished from the critical metrics for which specific thresholds are set to judge breaches in ICANN's ability to provide an appropriate level of service.
- Clear definition. Each metric should be sufficiently defined such that there is a commonly held understanding of what is being measured, and how an automated approach would be implemented to measure against the standard.
- Definition of thresholds. The definition of specific thresholds for a performance criterion should be based on analysis of actual data. This may require first the definition of a metric, then a period of data collection, and later analysis by the community before the threshold is defined.
- Review process. The service level expectations should be reviewed periodically, and adapted based on the revised expectations of the community and updates to the environment. They should be mutually agreed between the community and the IANA Functions Operator.
- Regular reporting. To the extent practical, metrics should be regularly reported in a near real-time fashion.
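To make the first two principles concrete, here is a minimal sketch of how attributable and overall times could be derived from a request's event history. The event format, the party names, and the `attribute_time` function are assumptions made for illustration; they do not describe how the RZMS actually records or computes these values.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical event log for a single change request. Each entry records the
# moment responsibility passed to a party; None marks completion of the request.
events = [
    (datetime(2016, 3, 2, 9, 0), "iana_staff"),
    (datetime(2016, 3, 2, 11, 0), "customer"),
    (datetime(2016, 3, 3, 11, 0), "iana_staff"),
    (datetime(2016, 3, 3, 12, 30), None),
]


def attribute_time(events):
    """Sum the elapsed time between consecutive events per responsible party."""
    totals = defaultdict(timedelta)
    for (start, party), (end, _next_party) in zip(events, events[1:]):
        totals[party] += end - start
    return dict(totals)


# Attributable measures: time charged to each party.
per_party = attribute_time(events)

# Overall time: end-to-end duration, regardless of who was responsible.
overall = events[-1][0] - events[0][0]

for party, elapsed in per_party.items():
    print(f"{party}: {elapsed}")
print(f"overall: {overall}")
```

In this sketch the per-party totals always add up to the overall time, which is one way to check that no interval has been left unattributed.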
As of 2 March 2016, the code changes needed to collect timing information for the Service Level Expectations have been in production. The next step is to consider how the collected data will be used. In this post, we'll discuss how we got to where we are and what to expect going forward.
Background: Setting Measurement Requirements
A Cross-Community Working Group (CWG) Design Team ("DT-A") was tasked with developing performance standards. DT-A developed a set of metrics to collect in order to assess the service levels provided by the IANA root zone management function. ICANN staff took part in this process as subject matter experts, walking the Design Team through the detailed root zone management process, explaining how it works, and discussing the implications of various proposed measurements. We explained the typical processing flows and shared unusual cases that can make some measures more complicated than they appear at first glance. While ICANN staff did not set the metrics, we did advise the Design Team when we thought certain measurements would be impossible to collect. The outcome of this work is the Design Team's final report, which formed part of the CWG proposal.
Implementation: Adding New Measurement Capabilities to RZMS
Since the completion of the CWG proposal, the team that develops ICANN's Root Zone Management System (RZMS) has worked to instrument the system to capture system events that relate to the proposed new metrics. This was necessary as some of the measurements from the Design Team's work relate to fine details of certain process flows that weren't recorded originally. As mentioned, this work has been completed and the first version that supports this new logging was deployed on 2 March 2016. ICANN staff will be reviewing the first batches of log data to determine whether further refinement is needed to successfully calculate and report against the SLEs.
Collecting Initial Data
With the new performance statistics collection capabilities of the RZMS in place, ICANN staff has begun collecting raw data. These data will form the basis of future discussions with the broader root zone management community, i.e., the many organizations that manage TLDs. In these discussions, we expect to explore which metrics are key and what the Post Transition IANA's performance against those metrics should be. The CWG proposal requires that service levels be set using analysis of actual data, and these data will be used in that analysis. Given the time constraints associated with the transition and our expectation that the metrics must evolve over time, we believe an initial data collection period of three (3) months will be sufficient to set a baseline for the initial thresholds as discussed below.
Data Aggregation and Reporting
While the initial data collection is underway, ICANN staff will develop the necessary systems and tools to aggregate the raw data flow and convert the data into the formats the community expects. These formats will include a dashboard view suitable for real-time publication, and any regular reports that need to be generated. In addition, a version of the raw logs with confidential information redacted will be published, allowing third parties to perform their own analysis and measurements and to independently verify our interpretation of the data. We will share our progress as the data collection and tools are developed and publish the data as soon as we gain confidence in their accuracy.
Setting Thresholds and Service Levels
With sufficient data in hand to perform comprehensive analyses, and with the tools necessary to analyze and report the data deployed, we will engage with the community of TLD managers to identify what the "critical metrics" are, and what their thresholds will need to be. These thresholds, once agreed between ICANN staff and the community, will be the new root zone management performance standards in the post-transition environment. Beyond this, the regular feedback mechanisms defined in the transition proposal, including engagement with the newly formed Customer Standing Committee (CSC), will inform the future evolution of the reporting and expected performance of the IANA root zone management.
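As one illustration of what setting a threshold from actual data could look like, the sketch below derives a candidate threshold from a set of baseline samples using a nearest-rank percentile. The choice of a percentile, the 90% figure, and the sample values are all assumptions made for this example; the actual method and thresholds are to be agreed with the community.

```python
def nearest_rank_percentile(values, pct):
    """Return the smallest sample at or below which at least pct% of samples fall."""
    if not values:
        raise ValueError("at least one sample is required")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


# Hypothetical baseline: processing times in hours for ten change requests.
baseline_hours = [2.0, 3.5, 1.0, 4.0, 2.5, 30.0, 3.0, 2.0, 5.0, 3.5]

# Candidate threshold: the time within which 90% of baseline requests completed.
candidate_threshold = nearest_rank_percentile(baseline_hours, 90)

# Requests that would have exceeded the candidate threshold.
exceeding = [h for h in baseline_hours if h > candidate_threshold]

print(f"candidate threshold: {candidate_threshold} hours")
print(f"requests exceeding it: {exceeding}")
```

A percentile is less sensitive to a single unusually slow request than a mean or maximum would be, which is why it is used in this sketch.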
We have begun collecting the metrics the community and ICANN staff agreed upon. ICANN staff looks forward to working with the community as the data on the timing of root zone management actions by staff, the community, and others are obtained, collated, and made available to those interested, in keeping with ICANN's commitment to openness and transparency. We anticipate that a period of three (3) months of metrics collection will provide sufficient data to set performance targets for IANA. After the transition has been implemented, the CSC will be able to evaluate IANA root zone management performance and adjust specific service levels to best meet the community's needs.