New England Environmental Goals and
Indicators Partnership

Summary Report on the Assessment of
Core Performance Measures and Regional Indicator Development

March 1999


Prepared by the Green Mountain Institute for Environmental Democracy,
for the NEGIP Steering Committee.

Since 1995, the New England Environmental Goals and Indicators Partnership has served as a regional forum for the consideration and development of innovative environmental management tools. Through this partnership, the New England states and EPA-New England have explored the specific application of environmental performance and management reporting. In 1997, NEGIP turned its attention toward GPRA, the NEPPS process, and Core Performance Measures to help the region use these national measures and to ensure that they add value to regional and national reporting as well as to state management activities.

--------------------------------------------------------------------------------

Abstract
The New England Environmental Goals and Indicators Partnership (NEGIP), with assistance from the Green Mountain Institute for Environmental Democracy, conducted an extensive evaluation and comparison of the data sources available in the region to support a sample set of measures. This exercise yielded several findings and recommendations on the development of Core Performance Measures, and on the development of indicators generally. A key finding is that the level of consistency required for regional indicators is difficult to achieve given (a) a lack of clarity about what the indicators are intended to measure, for whom, and for what purpose, and (b) a lack of consistency across states in both the type of data collected and the methodology used. Moreover, the Steering Committee has observed that inconsistencies among state data collection strategies reflect the flexibility granted to state agencies in conducting their environmental management programs. The NEGIP Steering Committee recommends that national indicator development consider the necessary balance between the important data (and programs) unique to each state and the need for a national picture of environmental performance. An appropriate balance may be achievable through further discussions that result in a regional consensus on a small number of consistent "core measures" for New England for which the intended use is clear. Other data collection and/or indicator development efforts could continue to enhance information amenable to state-specific program management and communication needs. The Steering Committee also suggests that careful data screenings are valuable in developing and implementing useful indicators, and that a collaborative regional process proved effective for considering indicator development.

--------------------------------------------------------------------------------

I. Background
In 1995, representatives from the six New England state environmental management agencies and EPA-New England began collaborating on a menu of environmental indicators that would measure (1) the status and trends of the quality of the New England environment and (2) program accomplishments toward reaching state and regional environmental goals. Over the course of that first year, the New England Environmental Goals and Indicators Partnership (NEGIP) involved over 100 state, federal, and other interested participants through two rounds of workshops to identify and evaluate potential indicators for air, water, waste management, and ecosystems in the region. The NEGIP Steering Committee used the information collected as the basis for reaching agreement on an initial regional menu of 23 environmental indicators that all six states and EPA agreed were worthwhile and for which supporting data currently existed. The Steering Committee also identified nearly 100 other potential indicators that were worth further examination but not yet regionally supportable by current data collection efforts.

While it was clear that all six states could report, for example, "the percent of rivers and streams supporting designated uses for fish consumption," the Steering Committee recognized that this was not necessarily sufficient for developing reportable indicators at the regional level. For an indicator to have meaning at a regional level, there needs to be consistency in both the data sets (e.g., the same units and time periods) and the methodology. Do all states have the same criteria for fish consumption? Do all states even assess waters of similar character (e.g., some may target waters with specific pollution sources)? The Steering Committee felt that these and similarly detailed questions about the data in each state would be crucial to evaluating the potential for the New England states to report on a sample set of measures from their original menu.

The Steering Committee hypothesized that implementing a set of regional "core" measures would be challenging, and sought to understand the specific challenges for New England and their implications for the development of useful regional measures. Toward this end, NEGIP work in 1998 focused on a "data screening" effort guided by three general questions:

1. Can each state report a particular indicator?

2. How can each state report a particular indicator? (e.g., what units, time periods, coverage)

3. If reported, what does a particular indicator represent in each state, and is this consistent across states in the region? (e.g., does "pounds of waste landfilled" include imported waste? In each state?)

What started as an effort to build regional capacity for developing and using indicators has taken on greater significance as measurement issues within the NEPPS and GPRA contexts1 are identified and debated both nationally and within individual state-EPA negotiations. Key questions surrounding the development and use of appropriate Core Performance Measures (CPMs) have risen to the forefront of EPA-state relations in recent years. Recognizing that the groundwork laid in New England might inform these discussions, the NEGIP Steering Committee agreed to revise the language of some of the original NEGIP indicators to be consistent with similar CPMs. In most instances the Steering Committee deferred to the language of the CPM. Thus, the findings and recommendations contained herein are based on an evaluation of data availability and quality for 12 example indicators, including six specific Core Performance Measures.

It is important to note that the Steering Committee selected its indicators for analysis based on the assumption that all six states and EPA maintain data to support those measures. The 12 indicators selected for data screening are not intended to imply a comprehensive measurement system, nor do they carry a NEGIP endorsement as "recommended" indicators. The group envisioned this short list of air, water, waste, and ecosystem indicators as an experiment from which lessons could be drawn to enhance future indicator development.

This summary report contains the key findings and implications of the data screening process, along with the recommendations of the NEGIP Steering Committee for moving forward. The detailed results of the data screening for the 12 selected indicators are contained in Appendix A, Indicator Data Catalog: An Evaluation of Data Issues Related to the Development of Core Performance Measures and Regional Environmental Indicators.

--------------------------------------------------------------------------------

Footnote 1. National Environmental Performance Partnership System and Government Performance and Results Act

II. Process
Beginning with its existing menu of 23 potentially regional indicators, the Steering Committee selected a subset of 12 measures covering air, water, waste, and ecosystem issues. The group then compared this list to similar EPA Core Performance Measures with the goal of using CPM language wherever possible (i.e., where a CPM was close enough in meaning to a previously selected NEGIP indicator). In some cases, the Steering Committee retained the original NEGIP indicator language. In others, the language of a similar CPM was substituted. And in a few instances, indicator language was modified to clarify what the Steering Committee intended, or to encompass the meaning of both the NEGIP indicator and a related CPM.

With the assistance of the Green Mountain Institute for Environmental Democracy (GMI), NEGIP enlisted the cooperation of approximately one hundred individuals in state environmental management agencies, state health departments, EPA-New England, and EPA-Headquarters to identify the specific characteristics of the data collected by each agency that would support the example indicators (e.g., dates available, spatial coverage, quality, comparability, methodology). Steps taken in the data screening process are described below.

May 1998 Verification, reconciliation, and refinement of indicators
As described above, the Steering Committee and GMI met to consider the original indicator list. The Steering Committee refined the language of each indicator to clarify its meaning and to align it with similar CPMs.

June 1998 Identification of data and staff
The Steering Committee identified staff within their agencies to help document existing data sets and the specific ability of these data to support the indicators. Steering Committee (SC) members furnished GMI with lists of contacts and phone numbers.

June 1998 Development of screening tool
GMI and the SC developed a screening survey, tailored to each indicator. The surveys posed questions about the data available to support the measures, including temporal and spatial coverage, methodology and quality, and availability. The "screening tools" were reviewed by SC members and their agency staff, revised, and distributed to the designated contacts prior to phone interviews.

June - August 1998 Screening of the data
GMI contacted the designated staff and conducted brief phone interviews based in part on the questions from the screening surveys.

August - September 1998 Production of data catalog
GMI reviewed the results of the interviews and compiled a catalog of data available for each indicator, for each state and EPA-New England. GMI summarized and compared the availability, quality, and consistency of data for each indicator across the region, focusing on a) whether or not all the states could support the indicator, and b) how similar in quality/methodology the data are among states. (See Appendix A: Indicator Data Catalog).

September 1998 Discussions of the implications
The Steering Committee reviewed and discussed the findings of the data screening, their implications, lessons learned, and potential next steps based on the substantive findings for New England.

November 1998 Recommendations
The Steering Committee considered recommendations based on the lessons learned from this process. Final recommendations are offered on suggested next steps for New England, on the ongoing CPM development process, and on indicator development generally.

III. Findings: Potential challenges and key observations for the 12 indicators
In each of the 12 examples, the data screening revealed specific challenges for reporting on the selected metrics at a regional level. In general, subtle differences among states' definitions, criteria, and/or data collection methodologies can have profound impacts on the ability to use state data sets to build meaningful regional indicators. This challenge is compounded by the often vague or ambiguous wording of the measures, which makes it difficult to infer the intended use of the information.

IIIa. Unclear indicator language
When the wording of an indicator does not convey the specific parameters to be measured, much is left to interpretation. Without explicit guidance, each state will report the measure in the way that is most useful for its own program management or communication purposes. For example, the air indicator "Trends in ambient air quality, for each of the 6 criteria air pollutants" does not prescribe the desired statistical aggregation, nor does it explicitly guide the selection of which air monitors to include in the data set. Representing air quality as average concentrations of ozone over time reflects long-term, general trends in air quality. It may suggest whether levels of ozone are increasing or decreasing, but it is not particularly useful to a decision maker concerned about how often air quality exceeds national standards. On the other hand, reporting daily maximum concentrations will reflect more dramatic variations than average concentrations. This statistic might not be useful for communicating general trends, given its sensitivity to climatic and other variations, but might better inform a decision maker about whether an area is experiencing ozone levels close to national standards - crucial information for an area concerned about possible non-attainment status.

Thus, states may vary in their interpretations of the same indicator, depending on the program decisions they need to make. Achieving consistency in interpretation is unlikely absent a specified use for the indicator. The way an indicator such as "trends in ambient air quality..." is operationalized (i.e., which statistics are used to represent ambient air quality) will affect how well it addresses its intended use.
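As a purely illustrative sketch (the readings and the 0.08 ppm threshold below are hypothetical, not an actual standard or monitoring record), the following shows how two plausible operationalizations of the same ozone data - a long-term average versus a count of days near a threshold - can tell different stories:

    # Hypothetical daily maximum ozone readings (ppm) from one monitoring area.
    daily_max_ozone_ppm = [0.062, 0.071, 0.084, 0.055, 0.090, 0.067, 0.078]

    # Operationalization 1: average concentration (communicates general trends).
    average_ppm = sum(daily_max_ozone_ppm) / len(daily_max_ozone_ppm)

    # Operationalization 2: frequency of readings at or above a threshold
    # (informs decisions about possible non-attainment).
    threshold_ppm = 0.08
    exceedance_days = sum(1 for value in daily_max_ozone_ppm if value >= threshold_ppm)

    print(f"Average of daily maxima: {average_ppm:.3f} ppm")
    print(f"Days at or above {threshold_ppm} ppm: {exceedance_days} of {len(daily_max_ozone_ppm)}")

Both statistics are legitimate summaries of the same data set, but they answer different questions; absent guidance on intended use, states could reasonably report either.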


IIIb. Variability in definitions and criteria
Other indicators are more prescriptive than the air example above. Yet they still contain terms for which states have varying definitions. "Population served by public surface water with state-approved source protection programs / population served by public surface water systems" gives a fairly clear sense of how the indicator is to be computed, but does not include an explicit definition or criteria for what constitutes a "source protection program", other than that it be "state-approved." Thus the indicator is sufficiently prescriptive, but still subject to variability in state criteria for source protection program approval.

Another often-cited example is the indicator based on 305(b) data, "Percent of assessed waterbodies that protect public health and the environment by supporting [designated uses]", a measure prescribed in law and described in extensive federal guidance. However, the law also builds in flexibility by requiring states to define their own water quality criteria, which allows variability in what constitutes an "assessed waterbody" and in how support of "designated uses" is determined.

IIIc. Variability in data collection methodology
For many indicators, even if similar data sets exist in each state (e.g., same units, aggregation, time), the methods used to acquire the data may have implications for comparability across states. For example, most states have data on solid waste. The indicator "amount of waste recycled, landfilled and incinerated" is fairly prescriptive in its language (although "amount" may be defined as either weight or volume, commonly reported data suggest that weight is the more standard metric). Yet two states reporting the same indicator - waste, in pounds, landfilled - may be telling very different stories, depending on where the data come from. The indicator does not prescribe the methodology for reporting the amount of waste. The amount of waste reported by a hauler does not necessarily reflect the total amount of waste that is landfilled: a landfill might import waste from out of state, which contributes to waste landfilled yet goes unaccounted for by in-state haulers. Conversely, if the data used to support the indicator come from a disposal facility, they may not account for waste that is generated in-state but hauled out of state.

While Rhode Island reports nearly all waste data from its central landfill, most states have available a mix of data from disposal facilities, haulers, and transfer stations. This mix affects the quality of the data within a state (waste may be double-counted or unaccounted for) and suggests that the same waste indicator may mean very different things when reported across the region.
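The following minimal sketch (all tonnages invented for illustration) shows how the same indicator label - pounds of waste landfilled - can yield different totals depending on whether hauler reports or disposal-facility reports are the data source:

    # Hypothetical annual figures, in pounds, for one state.
    waste_reported_by_in_state_haulers = 900_000   # includes loads hauled to out-of-state landfills
    waste_hauled_out_of_state = 150_000
    waste_imported_to_in_state_landfills = 200_000

    # Data source 1: in-state hauler reports.
    # Misses imported waste; includes in-state waste that ultimately leaves the state.
    hauler_based_total = waste_reported_by_in_state_haulers

    # Data source 2: in-state disposal facility reports.
    # Captures imports, but not in-state waste hauled out of state.
    facility_based_total = (waste_reported_by_in_state_haulers
                            - waste_hauled_out_of_state
                            + waste_imported_to_in_state_landfills)

    print(f"Hauler-based 'waste landfilled':   {hauler_based_total:,} lbs")
    print(f"Facility-based 'waste landfilled': {facility_based_total:,} lbs")

Without knowing which methodology a state uses, the two totals would be reported under the same indicator name even though they describe different quantities.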

While the screening was extensive, it was by no means exhaustive. Fully cataloging the complexities and details of each agency's data sets was beyond the scope of this project. What the Steering Committee has learned by bringing agency data managers and program staff into a discussion on indicator development is that even subtle differences in the supporting data can have a profound effect on the meaning of the story an indicator tells. Even for 12 indicators that were supposed to be reportable by all the states, the Steering Committee identified numerous examples of significant variability in the data. The Steering Committee expects that understanding the implications of unclear indicator language, variability in definitions and criteria, and variability in data collection methodology will be equally important in the development of other indicators.

IV. Observations on multi-state indicators and the NEGIP process

IVa. Balancing the valuable uniqueness of each state's information and reporting mechanisms with the need for consistent measures to paint a regional or national picture involves tradeoffs.
Rhode Island has water quality data to assess all watersheds (basins) within the state for reporting under the requirements of section 305(b) of the Clean Water Act. Vermont assesses a proportion of its more susceptible waterbodies in a typical reporting cycle. Both states have data available to report the "percent of assessed waterbodies that support designated uses." Yet the same indicator obviously has very different meanings in each state, and combining the information or reporting the two percentages side by side would be misleading. A thorough understanding of the supporting data sets is necessary to understand the state contexts in terms of how the data is collected and what it represents. In some cases this understanding might help identify a "common denominator" that does not require or imply consistency in how waterbodies are assessed (e.g., "percent change/improvement in meeting designated uses"). While such an indicator would be comparable and accurate, the richness of each state's data sets can get lost in the translation. Thus, in seeking a "common" measure, the value of that measure for informing state-specific management decisions is, in some cases, potentially compromised.
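A minimal, purely hypothetical sketch of this point follows (all counts invented). It contrasts the raw "percent of assessed waterbodies supporting designated uses" for two states with very different assessment strategies, and then computes a change-based measure that does not depend on both states assessing the same universe of waters:

    # Hypothetical assessment results. State A assesses essentially all of its
    # waterbodies; State B targets a smaller set of more susceptible waters.
    state_a = {"assessed": 500, "supporting": 400}   # broad assessment
    state_b = {"assessed": 80,  "supporting": 40}    # targeted at susceptible waters

    def percent_supporting(state):
        return 100.0 * state["supporting"] / state["assessed"]

    # Side-by-side raw percentages are misleading: the gap reflects assessment
    # strategy as much as water quality.
    print(f"State A: {percent_supporting(state_a):.0f}% of assessed waters supporting uses")
    print(f"State B: {percent_supporting(state_b):.0f}% of assessed waters supporting uses")

    # A possible "common denominator": change over time within each state.
    state_b_prior = {"assessed": 80, "supporting": 32}
    change = percent_supporting(state_b) - percent_supporting(state_b_prior)
    print(f"State B: {change:+.0f} percentage-point change in waters supporting uses")

The change-based measure is more comparable across states, but it discards the detail about which waters each state assesses and why, illustrating the tradeoff described above.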
If the states and EPA agree that there is value in a list of measures for national environmental conditions and trends in core program areas, then both parties might collaborate on a list that meets the requirements for national measures, i.e., find the common denominators. Another option would be for EPA to focus on making use of existing national databases for a limited set of measures, while joint discussions focus on developing negotiable measures for consideration in Performance Partnership Agreements. These negotiable measures could be crafted to utilize the state-specific data collection efforts and methodologies that, in some cases, are potentially more relevant and useful for informing state management efforts than measures that represent the "least common denominator" among all states. Such measures would not require the extensive regional process necessary for finding "common denominators" among multiple states, and would likely provide the basis for valuable discussions between states and EPA on utilizing state and national data for better environmental management.


IVb. Data availability, not environmental goals, is driving the development and use of core measures in the short term.
Like the indicators selected for the NEGIP menu, the current set of Core Measures is driven primarily by the presumed availability of comparable data sets across the states. Based on the New England experience, only a small number of indicators are likely to be consistently reportable at this time, and only then with a significant investment to understand the data and to resolve the implicit decisions needed to make them reportable. Indicators at this stage are at best only indirectly linked to environmental goals and objectives. For example, the indicator "Amount of solid waste landfilled, recycled, and incinerated" describes a general concern for the amount of waste that must be managed in a state, regionally, and/or nationally. However, it does little to inform discussions regarding the explicit goals and objectives of EPA or states regarding waste - most of which are likely similar to EPA's management-oriented goal of "better waste management, restoration of contaminated sites, and emergency response". While this situation is not likely to change in the near term, working toward "better" indicators - those that are linked to environmental goals and objectives, as well as to other measures representing the larger context of the environmental quality/management story - is desirable in the longer term. Understanding the monitoring strategies and rationales behind the data sets intended to support Core Measures is the first step toward transforming existing data collection systems to support better, more meaningful indicators.

IVc. These were supposed to be the easy ones.
The original NEGIP menu of indicators was the result of a lengthy regional process involving both policy and technical program staff from each of the six state agencies and EPA-New England. The 12 indicators selected for the data screening represent the overlap between the regionally developed NEGIP menu and the nationally developed Core Performance Measures. Thus, the Core Performance Measures evaluated were those that NEGIP groundwork had suggested were in the best shape for reporting on a regional or national basis. Nevertheless, inconsistencies in the way states define key terms and differences in monitoring strategies prevented a clear understanding of how to report the indicator in all 12 cases. Once a state or EPA agrees on measuring, for example, "trends in criteria air pollutants," several steps must still be taken to develop explicit, operational language for a reportable indicator, such as "trends since 1995 in ambient concentrations of each criteria air pollutant, calculated as the average daily concentrations from all monitoring stations, in parts per million." The sizable gap between identifying a measure for which there is regional (or national) agreement and developing a reportable indicator suggests that overall, much is left to interpretation. In the interest of both consistent and useful measures, this interpretation is difficult absent clear guidance on the measures and the context for their use.

IVd. New England's regional process has proven helpful both in the identification of key measurement issues and in enhancing the cooperation necessary for successful resolution.
The original menu of NEGIP indicators resulted from iterations of technical staff discussions and multi-agency policy-level discussions. Regional indicator workshops brought together state and federal agency program staff, academics and other interested parties, and agency policy makers to create better communication networks and a structure for partnerships. This has helped achieve the necessary buy-in at a number of levels across agencies and program areas. Because the menu of indicators is home-grown, the inevitable challenges in making them reportable are addressed in the context of previous collaboration rather than through a top-down process.

IVe. The public is an important audience and participant for regional indicator development.
Over the last three years, NEGIP has struggled with questions of public involvement in indicator development. It is difficult, and often misleading, to attempt to represent a complex environmental issue with a single indicator, even for the purposes of getting feedback on the understandability of the indicator itself. Yet it may not be realistic to obtain meaningful public input on a more comprehensive measurement system. There is clearly a need for more thoughtful discussion on an appropriate and realistic role for the public in the development and use of Core Measures.

V. Recommendations
Va. New England should take the next steps toward a set of reportable regional measures.
New England has invested a great deal in learning what it means to develop regional indicators. This data screening exercise represents not the end product but an interim step. The questions raised and lessons learned from this exercise should be used in convening a collaborative process for developing reportable indicators for the region, as well as in assisting states with the development of measures that contribute to their own performance management systems. Reportable indicators are those that the region is in the best position to report in a consistent and meaningful manner; these are not necessarily the "best" indicators because, as noted above, in the short term reportable indicators will be driven by data availability rather than environmental goals. This effort would include identifying (a) the decisions that need to be made to arrive at a consistently reported and interpreted set for the region, and (b) groups to serve as decision makers for each indicator. Steering Committee members have held preliminary discussions and agree that, for the most part, these decisions are feasible, but that making them needs to involve the commissioners, program managers and staff, and data managers.
It would also be valuable to undertake a similar screening process for a broader set of the Core Measures. If these are to be pursued, it can be assumed that they will prove at least as challenging as the 12 examples in making the transition to reportable indicators. Again, the identification of the characteristics of specific data sets is a prerequisite to moving from the language of the Core Measures to reportable indicators.

Vb. More work is needed to clarify the reasons for, use(s) of, and audiences for CPMs.
Both in the region and nationally, there is a strong need to better articulate the reasons for, and uses of, reporting Core Measures (e.g., a national picture of environmental trends or measuring the effectiveness of programs). In cases where an indicator could be presented several ways, each addressing a different information need, an understanding of the available data alone will not determine what specifically to report. The best indicators are reported so that they provide the most relevant information to the audiences or decisions they are supposed to inform. The structures of statutes and agency programs offer some guidance on what kinds of information these indicators are intended to provide and what decisions they are intended to inform. However, more work is needed to articulate the linkages among agency activities and objectives and environmental goals, as well as what the public wants to know about environmental conditions and program accomplishments. Understanding this system of linkages among activities, desired outcomes, and desired information may help identify what information is needed to make decisions or communicate results, and help move toward better indicators in the longer term.

Vc. Balance attention given to developing realistic Core Measures in the short term with attention to the longer-term development of a measurement system.
Efforts aimed at improving the measures in the short term, based on making the best use of available data, should not compromise a longer-term vision of Core Measures that are linked to articulated environmental goals and objectives within an integrated management system. Recognizing the difficulties inherent in using data sets designed for different purposes in various states to support a single consistent measure, it is important to keep expectations realistic in the short term. NEGIP's regional pilot suggests there may be only a handful of measures that are comparable or useful at the national level. Efforts to refine and improve the short-term list will be more efficient with a small, manageable number of measures, a set that can grow in both size and relevance as the region moves toward measures that link with goals and objectives.

Developing a system of measures that represents the complex system of management activities and environmental performance is challenging, particularly when the development of measures is guided by existing data sources and pre-existing program reporting. In the short term, it may not be realistic to expect a set of measures that comprehensively captures the complex linkages among activities and outcomes on a national scale, yet this appears to be a goal for the longer-term progress of the national environmental reporting effort. If so, the development of environmental information systems like NEPPS should continue to improve the use of indicators as systems of information that are tailored to environmental reporting and management needs. Various frameworks have been developed that provide direction for developing sets of indicators that tell a meaningful story about a particular issue or program and that organize the information in a manner that informs agency-level decisions. Toward this end, NEGIP has piloted a process and framework for the collaborative development of regional indicators, using mercury as an example issue. The resulting conceptual framework (in this case, Pressure-State-Response) illustrates how a comprehensive indicator system might be used simultaneously as a reporting tool and a management tool for achieving goals and objectives.2

Vd. The regional approach to the development of Core Measures should be replicated in other regions.
The New England experience has shown that a well-executed indicator development process can enhance the chances of successfully using the product. The need for buy-in among technical and program staff, as well as upper management, cannot be overstated. Providing a regional forum for discussing common concerns serves to pool our resources for addressing those concerns in a way that makes sense for the region as a whole. It has also enhanced communication within and across agencies and brought about a unified regional consensus on measurement issues being discussed nationally. The regional process has also provided for the exchange of experiences with indicator development in the various states. Regions or other multi-state groups should consider facilitating the ongoing transfer of indicator development experience to enrich the state and national discussions.

These benefits have come as a result of a considerable investment of time and money, and the commitment of individuals at the policy and technical levels to collaborate in new ways. Forums such as Steering Committee retreats and regional workshops have merged technical and policy discussions with worthwhile results. It might be more difficult to have these types of substantive discussions at a national level. An alternative approach would be to foster multiple regional discussions similar to the NEGIP process, reach agreement on regional measures, and then seek commonality among regions.
--------------------------------------------------------------------------------
Footnote 2. For a summary of NEGIP's mercury pilot and the conceptual framework that was developed, see Developing Indicator Systems for Issues of Regional Concern: Recommendations for a Collaborative Process. Green Mountain Institute, for the NEGIP Steering Committee, 1999.

GMIED
75 Clarendon Ave.
Montpelier, VT 05602
(802) 229-6072
gmied@gmied.org