Response to reading

Description

We have to come up with either a question or a comment on each article after reading it. The comment should reflect a full understanding of the article, and the question has to show critical thinking about the issue presented in the article.

Unformatted Attachment Preview

Article
A socio-political framework for evaluability assessment of participatory evaluations of partnerships: Making sense of the power differentials in programs that involve the state and civil society
Evaluation 18(2): 246–259. © The Author(s) 2012. Reprints and permission: sagepub.co.uk/journalsPermissions.nav. DOI: 10.1177/1356389012442445. evi.sagepub.com

Hélène Laperrière, University of Ottawa, Canada
Louise Potvin, University of Montréal, Canada
Ricardo Zúñiga, University of Montréal, Canada

Abstract
Jointly conducted with a coalition of HIV/AIDS community-based organizations (CBOs), this evaluability assessment sought to better understand the factors that affect the feasibility of a participatory program evaluation to be undertaken in partnership with the CBOs' nongovernmental-organization members and public-health agencies. Participatory evaluations and partnerships are grounded in social and institutional authority structures that unavoidably influence researchers and evaluators. The construction of a theoretical framework for socio-political evaluability assessment of participatory evaluations is a necessary precondition for the coalition's members to engage effectively in evaluation research with other partners.

Keywords: civil society, community-based organization, evaluability assessment, participatory evaluation, partnerships, program evaluation, public health, sociology of organization

Corresponding author: Hélène Laperrière, School of Nursing, Faculty of Health Sciences, University of Ottawa, 451 Smyth, Ottawa (Ontario), Canada K1H 8M5. Email: helene.laperriere@uottawa.ca

Introduction

Socio-political analysis prior to participatory evaluation

Participatory evaluation seeks to bring together multiple actors from different organizations and diverse points of view (Cousins and Whitmore, 1998; Guba and Lincoln, 1989). Its aim is to ensure program appropriateness and accountability to community groups (Springett, 2001). Participatory evaluation can be pluralist, practical and even emancipatory (Ridde, 2006). The methodology can foster a learning environment (Suarez-Herrera et al., 2009). Nonetheless, House and Howe's (1999) critique remains relevant: power relations among the stakeholders may jeopardize the inclusion of all values in a deliberative process. Not all the voices brought together in the participatory process are always able to be heard and to orient actions through the technical process of democracy (Plottu and Plottu, 2009; Wendhausen, 2002). Moreover, the traditional use of consultations as a democratic process for decision making can be a participatory device with the potential to favour those who have formal decision-making power (Burton, 2009). In public-health programs that require partnerships between the State and civil society organizations (CSOs), the evaluation process is often characterized by diverging perspectives vis-à-vis the requirement of evaluation. This is true even when evaluation is conducted with a participatory perspective. Typically, State agencies conceive program evaluations as an exercise with the potential to improve programs, and subsequently population health, while community organizations tend to perceive it as a nuisance that interferes with action.
When confronted with such opposite and diverging options, Friedberg (1972, 1997, 2000, 2005a, 2005b) argues that sociology cannot provide solutions: it defines the problem. Theoretical perspective In Friedberg’s sociology of organization, organized action can be analyzed according to three dimensions. One dimension is the will or managerial discourse. It represents the ideal organization, the strategic direction and the project that orients decisions and actions. Another dimension is made up of formal or apparent structures representing the materiality and objective features of an organization, such as organizational charts, buildings, and so on. The third dimension consists of deep or informal structures produced by social interactions. They represent the human side of the enterprise and the organization's real structure. The deep structures are difficult to appraise because they are not formalized in official discourse and written documentation. They remain in a State of ‘opacity’ (March, 2004a, 2004b) that Friedberg calls ‘clandestine management.’ Few studies have explored this third dimension with community-based organizations. The present study, jointly conducted with CBOs, seeks to make those structures apparent in participatory program evaluations and to propose a theoretical framework for a socio-political evaluability assessment. Friedberg (1997) views organizations as concrete systems made up of relatively autonomous actors linked by strategic inter-dependences and who maintain cooperative relations. He conceptualizes power relations as the potential influence that organizations can have in the negotiations and the formalization of rules. In this view, the relationship among the three dimensions presents two faces of action (Friedberg, 2005a): prescription and practice. The prescriptive side consists of the rule makers who, through a formal set of rules, dictate ‘what must be.’ The second facet is found in the deep and informal structures of praxis or social-actor practices. Social actors base their actions on the opportunities and constraints encountered in apparent structures, which are 248 Evaluation 18(2) contaminated by the logic of deep structures. The three dimensions are always present in all levels of the organization; however, each dimension is relatively autonomous and develops at its own pace and with its own logic. In this article, we argue that an exploration of the interplays between these three dimensions of organized action is beneficial to participatory evaluation. In line with the ‘expansion’ of the concept of evaluability assessment (Beaudry and Gauthier, 1992; Leviton et al., 2010; Meeres et al., 1995; Mercer and Goel, 1994; Thurston and Potvin, 2003; Thurston et al., 2003; Wholey, 1981; Wholey et al., 1994), we analyze socio-political evaluability assessment as a necessary component of the pre-evaluation process. We began our work using Friedberg’s theory of organization and then developed it through an examination of the first author’s fieldwork within the framework of a participatory evaluative research project. Taking conceptual guidance from Friedberg’s views on formal and informal organizational structures, we articulated our conceptualization through the examination of the inter-organizational partnership relations between State-run public-health agencies and civil society. 
The context of a coalition In Canada, with the rise of partnerships between the State and civil society within the network of federal and provincial public services, CBOs have expressed a desire to actively participate in defining evaluation requirements that are associated with the funding programs that provide them with resources (ARPEOC, 2005). The program related to this study was the AIDS Community Action Program (ACAP), a federal program based on partnership with community organizations (Public Health Agency of Canada, 2004, 2006, 2009). An intentional sample was made up of the 37 diversified groups related to the Coalition of community organizations against AIDS in Quebec, Canada (COCQ-sida). The members of the Coalition have a long record of participatory evaluation that dates back to the early 1990s (see Jalbert et al., 1997). The first HIV/AIDS CBO evaluation tools were developed with a view to situating evaluation as a part of the collective action. These tools included assessments from diverse actors directly involved in HIV/AIDS community actions. The ‘Shared Appreciation Group’ – a term coined by a community member – was proposed as a way to give voice to CBOs. Each group of actors in a CBO is invited to express its point of view regarding the CBO’s mission and collective actions. These groups are typically: board members, staff, users, volunteers and partners. This initial work resulted in Epsilon. A Self-Evaluation Guide for Community Organizations (Jalbert et al., 1997), published in French and English. A second guide specifically designed for staff and volunteers followed the first initiative (Zúñiga and Luly, 2005). Within this context, we define socio-political evaluability assessment as an analysis of the conditions involved in participatory evaluation processes. Evaluation is more than merely applying preconceived data-collection methods and an interpretive canvas. Given that participatory research has incorporated a public-health context of alliances (Mantoura et al., 2007), it ipso facto incorporates structural and power relations (positions, action capacities, social constraints) that both interfere with and are relevant to the analysis. Method Between 2004 and 2007, the first author conducted volunteer work that led her to participate in: (a) the Provincial Coalition General Assemblies; (b) the collective construction of evaluative tools for Laperrière et al.: A socio-political framework for evaluability assessment 249 interventions with members of CBOs; (c) a networking meeting of knowledge sharing expertise with regional HIV/AIDS CBOs and national public-health agents; (d) informative workshops and training about the HIV/AIDS Coalition community evaluation guide; (e) collaborative selfevaluation activities with the Coalition based on the Community Evaluation Guide with another coalition of groups working with young peer educators; (f) volunteer activities with people living with AIDS; and (g) provincial HIV/AIDS academic conferences. The evaluative research project began with an attempt to understand the logic of their interrelationships and then their relationships with the public-health agencies, which act largely as funding bodies. 
Four sources of information were mobilized in a continuum between direct action and intellectual organization: • • • • a critical review of the literature; collective discussions with actors involved in the community organizations as a device for reflexive practice; direct participation in developing and testing evaluation tools; and volunteer work in collective social-support action and supporting CBOs users. The formal research and participatory insertions in five HIV/AIDS CBO members of the Coalition comprised over 500 hours, which included more than 400 hours of volunteer participation and more than 100 hours of participatory observation. There were 105 visits linked to volunteer participation and 29 visits linked to other activities. Activities were undertaken with five Coalition member CBOs in urban, semi-urban, urban-rural, and metropolitan regions. The research protocol included 13 group interviews with a total of 47 participants. The group interviews focused both on the HIV/AIDS CBOs’ activities with people living with HIV/AIDS and the possible ways of evaluating their activities at the local level and bridging the distance between funding sources, the requirements of abstract models of evaluation, and the real operating conditions of community groups. The coalition tool ‘Strategic Analysis of Strengths and Weaknesses’ (Zúñiga and Luly, 2005) was used to explore the reconstruction of the community group’s history (experiences, competencies, routines, fieldwork) in order to visualize the ‘project’ as a realistic projection into the future. It looked at internal relationships (organizational dynamics, atmosphere of cooperation, administrative control, forms of participation, etc.) and external relationships (interpersonal dynamics, information, contacts, alliances, network, etc.). The data reduction, display and conclusion drawing/verification overlapped during the data collection to make up the analysis (Miles and Huberman, 1994: 11–12). The fieldwork took place in the daily life of HIV/AIDS CBOs. The results from direct observation were gathered in an auto-ethnographic journal (Ellis and Bochner, 2003). It embedded the role evaluator/ researcher in a cooperative relationship with the Coalition, the CBOs and people living with AIDS. Continued presence in the field as a volunteer allowed the early presentation of results to the coalition groups’ member participants. They were asked to read the notes, comment on the analysis, and suggest further interpretations. They decided which results should be made available in the general coalition assemblies and publications. The group interviews raised a number of factors: (a) the relative autonomy of the decision-making process; (b) recognition of different levels of participation in negotiating objectives and conditions of success; (c) the impact of inequalities on evaluation outcomes; (d) the cultural diversity of inter-organizational partnership settings; and (e) the presence of internal colonization. Other publications detail these conclusions (Laperrière, 2009a, 2009b). 250 Evaluation 18(2) Guided by the prolonged volunteer engagement and proximity, the first author drafted an early version of the theoretical framework. She constructed the framework’s ‘assemblage’ (Latour, 2005) out of conversations, individual and collective discussions with coordinators and meetings with people living with HIV and permanent members of the Coalition. She inductively conceptualized the pattern of relationships among the actors involved. 
Based on French theories of sociology of organization, it was gradually modified and remodelled as the fieldwork progressed. The meanings that emerged from the data, both from observations and group interviews, were tested for their plausibility and their verifiability during the frequent exchanges with HIV/AIDS community groups and the Coalition’s coordinating body. The final version was completed after the research immersion. Results Social relations of visible and invisibles structures: Direct and indirect influence In 2003, initial exchanges with the Coalition network highlighted that members needed a better understanding of the status and relationships of all members, as members were variously directly and indirectly involved in the community group’s social action. This was exemplified by the following verbal comment in one General Assembly Meeting: ‘We cannot talk freely; we have spies among us.’ Some CBO members commented that tensions were becoming apparent among various groups of actors involved in HIV/AIDS interventions, for the most part between public-health agencies (State) and CBOs (civil society). The tensions were commonly understood as the consequences of the divergent expectations held by the State and the civil society representatives. These tensions emerged when they tried to agree an evaluation process that would involve participation and partnership (Zúñiga and Luly, 2005). It appeared that these groups had discrepant ways of defining evaluation: • • Representatives of funding agencies valued the assessment of the achievement of formal objectives stemming from contractual relationships; Representatives of community organizations valued both the urgency of trying to respond to the assessed needs of users, and program actors who related the program objectives to the contextual conditions. Direct participation had an impact on the development of the Coalition community-based evaluation guide (2004–5), as follows. The Coalition and its members expressed concerns about evaluations that were undertaken by public-health agencies. The daily pragmatic activities of program actors and volunteers to meet user needs often went unnoticed by national program evaluations. Community efforts were often seen as redirecting energy toward other needs rather than meeting the explicit and agreed-upon objectives of the national program to deliver services to HIV-infected people. Community actors, however, felt that they had to work beyond the mandate for which the funds had been allotted. An ‘experimental mode’ (Laperrière, 2009b) was perceived as hindering their capacity to work with the unpredictability and uncertainty of daily interventions for people living with HIV. Discussions within groups revealed the need for a collective understanding of the social and political structures, before undertaking participatory evaluation. Two research questions were Laperrière et al.: A socio-political framework for evaluability assessment 251 presented at the Coalition’s annual general meeting in March 2005 and reformulated on the basis of discussions with the HIV/AIDS CBOs involved in the research: (1) How do social factors influence the feasibility (evaluability assessment) of the evaluation (which is anticipated by the public-health program and to be undertaken in partnership with community organizations)? 
(2) How does the relative autonomy of high-level decision makers impact: • • • their participation in negotiating the objectives and conditions of success, inequalities in implementation strategies, and the cultural diversity of local and micro-local settings? In order to explore these questions, the research goal was reframed as an analysis of the formal and informal inter-organizational structures in the establishment of partnership and decision making regarding evaluation. Based on an earlier study (Laperrière and Zúñiga, 2006), the starting point was therefore the need to identify the actors involved (users, staff, administrators, public-health agents, coalition, others) and to make visible the power relations and vertical control influences in their relations. An initial model, made up of four features related to the actors involved in the participatory evaluation process, started to emerge. The four features were: • • • • the actors’ identity with regard to the program; their relative positions of domination (power and control over the actions of others) or complicity with other actors; how power was exercised (physical and symbolic violence, direct or indirect influence, whether vertical relations encompassed more or less areas of control than entailed by hierarchical control) and attendant processes of imposed socializations; the response to those imposed socializations. Through group interviews with the CBOs’ participants, we discovered that the reality of the relationships was much more complex than the expected, formal structure, which is represented in Figure 1a. The national public-health program is conceived as a ‘social totality’ whereby the same or very similar actions take place in all the relevant CBOs (targeting users with HIV/AIDS and people at risk). This view of things implies a linear program transfer, which is in contradiction with the local actors’ expectations of continuous and more egalitarian exchanges with public-health agencies. Figure 1 illustrates the distance between the ‘ideal perception’ of social relations and a more realistic one (Figure 1b). In the logical frameworks of the national program, deep, informal relations are relegated to an anecdotal status. Doing so creates an impression of order and subsumes complex specificities within the idea of homogeneity. This impression enhances the belief of the equality of all the organizations in a partnership (CBOs, Coalition, and publichealth agencies). Thus, idealistic representations camouflage ‘deeper issues’ in order to avoid the inclusion of indirect and messier relations of informal influence. These relations could be out of step with official policy discourses of harmony and homogeneity. The public discourse obscures the power relations of influence favouring indirect actors such as public servants or politicians. Alliances reduce the diffusion of awkward incidents, which are nevertheless present in informal hallway conversations. 252 Evaluation 18(2) Figure 1a. The expected formal structure: A derived conception of the relations between actors Figure 1b. A realistic description of the formal and informal relations between actors Figure 1. Social relations of formal and informal, superficial and deep authority: direct and indirect influence The more involved the researcher became within the informal structures of organizations during volunteer work, the more pertinent it became to include the influence of outside actors in the analysis. 
In schematic terms, working with autonomous actors in collectives like the CBOs requires a shift from a perception dominated by totality, unity and hierarchical structure (see Figure 1a) to one that grasps and takes into account the underlying and theoretically unruly everyday relations (see Figure 1b). The Coalition did not operate in vertical decision-making organizational structures, such as is the case with bureaucracies typical of national publichealth agencies. A more realistic representation of the public-health program and its evaluation (see Figure 1b) would take into account the relations among the diverse partners and the indirect influence of other actors to maintain the partnership (administrators, politicians, civil servants, friends, etc.). A wide variety of actors was formally involved (senior civil servants responsible for coordinating national Laperrière et al.: A socio-political framework for evaluability assessment 253 program evaluation), co-opted (some members of CBOs) or marginalized (users of services other than those advocated by the public-health program, such as those focused on the plight of homelessness and poverty). In Figure 1a, all actors are subsumed under the upper levels of control, which unifies them and thinks and acts as their legitimate voice. This apparently tidy structure conceals the real differences in power and networking among actors. Formal structures transform interpersonal relations into impersonal claims: ‘The Ministry is studying the matter.’ These structures favour a discursive hierarchy consisting of two languages. The first language is the stilted, imposed and artificial one found in a program’s formulation. It is passively accepted as the expression of a formal authority. The second one (related to Figure 1b) is stigmatized as ‘anecdotal and colloquial.’ More suited to confirming the indirect influences of complicities and alliances, it is perceived as hypercritical, cynical and, therefore, a threat to unity. As a delinquent discourse, the second language has to be hidden under the gracious appearance of the language in which the program is presented. Evaluating under the tension between formality and informality A conceptualization of the socio-political framework for evaluability assessment emerged out of the group interviews and field observations. The participatory evaluation process usually seeks to connect so-called opposed partners (representatives of the national program and community groups). The frequent exchanges between CBOs and the Coalition led to a novel conceptualization (Figure 2) to better validate the expected and partly unforeseeable results of their joint actions. While the views of the State and of the community might appear to be opposed to one another, they are interlinked. At one pole (the State and the social), the action prescribed by the program (discursive authority) and the method (know-how) fit together. In this case, the local actions evaluated by a participatory evaluation would be in concordance with those prescribed by the national public-health agency. The image of a social totality (unified, structured, and hierarchical) would be the idealistic model, and the evaluation model would derive from this representation. At the other pole (civil society and the community), grassroots action is singular and differs from a predetermined method. In this case, the local actions evaluated by a participatory evaluation would be valued by autonomy, existential uniqueness and identity. 
The image of concrete singularity (diffused, unstructured, and horizontal) would be the idealistic model, and the evaluation would derive from this representation. The two poles illustrate the earlier concerns about participatory evaluation expressed by the Coalition members: the differences in perspectives between them and the public-health agencies. The progressive discussions showed, rather, a more complex network of interorganizational relationships between the poles. One view encourages the idea of a community that values the world of formal organizations and rigid evaluation processes. The other values a vision offering a close-up view of structures and objectivity as a progressively constructed consensus. The participatory evaluation process might act as if those beliefs can be brought together in a certain dialogue at a specific moment. Figure 2 reminds us of the complexity of the notion of participation. In the face-off between an informal network and an institutionalized conceptual structure, it is important to underline the contributions and expectations of both partners. For the informal network, participation means the recognition of a future partner as one with whom one shares basic values. The partner can be incorporated because it is trusted as ‘one of us.’ 254 Evaluation 18(2) Figure 2. The social tension of formality and informality in social action It is a known social entity and shares common interests. Participation and partnership have worked to render both understood and shared. For the formal totality, on the other hand, this sense of partnership can be difficult to incorporate. It is easy for a formal structure to think that participation is a contractual relationship based on conceptual explicitness. Both parts will have a clear idea of the obligations and benefits to be derived from the relationships in a contract. Solidarity means trust based on an experience of sharing. Clear-headed binding agreements take it for granted that when all questions are answered, trust will follow. Social totality masks a relationship traversed by tensions, in which mistrust of the unknown is the obstacle to sharing. It takes a concerted effort to realize that participation has to be built by strengthening both trust and clarity of purpose. Planning logic follows the taken-for-granted expectation that the project will only be Laperrière et al.: A socio-political framework for evaluability assessment 255 presented to potential participants for their acceptance. All decisions about problem definition, objectives, settings, and activities have already been made. However, the participants might subsequently feel that the formal agreement does not exactly encompass the actions they recognize as their own. Diversity in values, interests, strategies and behaviours will necessarily reappear in a participatory-evaluation negotiation process. Within each social reality and within each collective or individual actor, there is a never-ending tension between the attainment of a stable and unified totality and the respect of genuine diversity. A government is more than its head; a director is more than an all-powerful dictator. Neither organizational charts nor procedural protocols can suffice as a full response to future challenges. Formal structures mix with informal ones (Friedberg, 2000, 2005a). 
An authentic participatory-evaluation process that encompasses both government frameworks (here, HIV/AIDS public-health program) and civil society actions (here, HIV/AIDS CBOs) can be tempted to claim the autonomy of the inner member (here, the person living with HIV/AIDS). In this case, the singularity is conceived in terms of existential uniqueness, identity and autonomy (Figure 2). This movement constructs a subjective sense of belonging among those who share this perspective. In a partnership, the unofficial grassroots CBO partner network will ground its activities in its collectively constructed meaning, which will have to be shared with the official and institutionalized partner (State public-health agencies). The challenge here is to grasp social action and the ways in which it confronts both formal structures and informal social conglomerates. In participatory evaluation involving the State (including government agencies, semi-public and government-funded organizations) and civil society, the process is torn between a normative evaluation and a more communitarian one. The former tries to impose formal structures on what appears to be amorphous social agitation; the latter struggles to be an alternative that keeps alive the vitality of primary ties. Both perspectives have their corresponding challenges. The official viewpoint and its discourse must become understandable to their community partners; the community has the dual task of grasping its own experience and decoding the official discourse of their unavoidable interlocutor. Discussion Inter-organizational structures and meanings:The construction of an inclusive discourse Based on empirical observations and group interviews with a coalition and its members, we developed a theoretical framework for a socio-political evaluability assessment of participatory evaluation. This conceptualization highlights the need to take into account in a preliminary analysis values and interests, informal alliances and oppositions, and personality-tainted role definitions. A participatory-evaluation process that limits analysis to the parameters of formal rationality, formal structures and assigned roles, does so at the cost of impoverished results bound to lead to a sense of mutual misunderstanding. The commonly held view that formal structures are only a formality and that CBOs are characterized by a totally fluid atmosphere of unstructured and unspoken spontaneity is mistaken. Formal bureaucracies and informal groups must establish a dialogue precisely because they are not perfect examples of the ideal types they convey, and because ideal types cannot be transformed into concrete ideals of order versus harmony. A participatory-evaluation process would 256 Evaluation 18(2) benefit from including a collective identification of the evaluation themes, the methodologies underlying the choice of methods and the mechanisms to be set in place to conduct the evaluation. The resulting socio-political evaluability assessment will include the often-implicit cultural, social and political factors that affect the way in which the participatory-evaluation process is carried out in a partnership. Programming, evaluation and evaluability assessment are grounded in social and institutional authority structures within particular socio-political systems that inevitably influence the actors involved and their practices (House, 2004; House and Howe, 1999). 
This tension takes the form of pressure exerted from above by abstract theoretical conceptualizations generated by governmental and academic centres of excellence. Centres of expertise generally take it for granted that their conceptions are shared, consensual and univocal for all the actors involved in a proposed set of actions. They neglect to take into account the permanent distance between formalized and generalized collective entities (hierarchies, formal control structures) and demands for autonomy and for the primacy of the meanings of informal entities. Faced by these presuppositions, unofficial structures increase their power through implicit rules (Friedberg, 1997), cunning strategies and astuteness (Detienne and Venant, 1975). Bureaucratic efforts to control through rules that ignore the vitality of informal processes court their inevitable failure (Mintzberg et al., 1999). Ideal types of social organization are opposed only as ideas. It is necessary to go beyond the two ‘sectors’ in opposition: good community groups and evil bureaucrats. Field actors and policy makers are neither opposed nor mutually incompatible. The grounding in concrete local situations is never absent or forgotten by health-program planners and model-makers, and the meaning, aims and processes of implementation and evaluation is not absent in the thorough processes of ‘reflective practitioners.’ The criticism of the ‘either–or’ approach (Friedberg, 2000) should weaken the search for ‘ideal types’ (Weber), which could take dichotomies as empirical starting points. The necessity of local contextualization with personal and experiential knowledge Local contextualization and personal knowledge (Polanyi, 1964) enrich the understanding of a socio-political evaluability assessment by providing a better awareness of the actor network and the interrelations that constitute the participatory-evaluation reality. Rootman and Steimetz (2002: 9) stress the importance of focusing evaluation on day-to-day action by creating strong links between researchers and practitioners to better meet the needs of communities. In their view, these links are increased via the development of the ability of communities to conduct research and evaluations in order to use results jointly viewed as a database to increase the effectiveness of organizational, community and political health strategies. This consideration of the local is all the more important given that the socio-political network can be made up of formal and informal actors inserted in vertical relations of control and influence that can neglect a program’s unforeseeable aspects (Laperrière and Zúñiga, 2006) as well as the danger it entails for local actors (Laperrière, 2008). Programs that encompass a local perspective gain a better understanding of how they are integrated within the social world through social relations and local political debates, in particular between institutional organizations representing the State and civil society (community organizations). This production involves ‘the attentive observation of the scenes’ in which several actors or ‘agents’ of the local public-health space are active. To better understand what is going on, concrete insertion in the context is a prerequisite for taking into account local contextualization. Laperrière et al.: A socio-political framework for evaluability assessment 257 Jacob and Rothmayr (2009) argue for a better comprehension of the evaluation political context. 
They point out that the evaluation initiative derives mostly from managers responsible for the implementation of the program, the public administrators or the governmental agents for control and program accountability: ‘The one who calls the tune is the one who pays the tune.’ Public agents would reproduce the same political and administrative habits and procedures in which evaluative practices progress to an institutionalization rather than a space for civil-society participation in evaluation decision making (Jacob and Rothmayr, 2009). Conclusion The study focused on the civil-society perspective. We chose to strengthen the inclusion of local understanding in centralized planning, rather than to oppose them. The choice also arises from an ethical choice of justice and equity. CBOs possess fewer networks and resources to defend their perspective in the evaluation arena. We were able to discuss the implicit matters among us, which, otherwise, would have remained in the realm of informal private conversations. Moreover, we learned from the process of collective discussions and how to bring the political dimensions of participatory evaluation between civil society and State into the public sphere. Funding This research was made possible by the HIV/AIDS Research Program, Institute of Infection & Immunity, Canadian Institutes of Health Research (CIHR) Research Award (#IPD – 78574) as well as through a Complementary Training Award from Analyse et Évaluation des Interventions en Santé (AnÉis) Program for the first author. References Analyse et renforcement des pratiques évaluatives auprès des organismes communautaires – ARPEOC (2005) Analyse des pratiques d’évaluation dans les organismes communautaires [Research Report]. Montréal: Services aux collectivités de l’UQAM. Beaudry J and Gauthier B (1992) L’évaluation de programme. In: Gauthier B (ed.) Recherche sociale: de la problématique à la collecte de données, 2nd edn. Sainte-Foy: PUQ, 425–52. Burton P (2009) Conceptual, theoretical & practical issues in measuring the benefits of public participation. Evaluation 15(3): 263–84. Cousins JB and Whitmore E (1998) Framing participatory evaluation. In: Whitmore E (ed.) Understanding and Practicing Participatory Evaluation. San Francisco, CA: Jossey-Bass, 5–23. Detienne M and Vernant JP (1975) Les ruses de l’intelligence. La mètis des Grecs. Paris: Flammarion. Ellis C and Bochner AP (2003) Autoethnography, personal narrative, reflexivity: researcher as subject. In: Denzin NK and Lincoln YS (eds) Collecting and Interpreting Qualitative Materials, 2nd edn. Thousand Oaks, CA: SAGE, 199–258. Friedberg E (1972) L’analyse sociologique des organisations. POUR 28: 1–120. Friedberg E (1997) Le pouvoir et la règle. Paris: du Seuil. Friedberg E (2000) Going Beyond the Either/Or. Journal of Management and Governance 4(2): 35–52. Friedberg E (2005a) La culture ‘nationale’ n’est pas tout le social. Réponse à Philippe d’Iribarne. Revue Française de Sociologie 46(1): 177–93. Friedberg E (2005b) Qu’apporte la sociologie au management des organisations? Seminar at the Centre Humanismes Gestions et Mondialisation, Hautes Études Commerciales, Montreal, 27 April 2005. Guba EG and Lincoln YS (1989) Fourth Generation Evaluation. Newbury Park, CA: SAGE. House ER (2004) The role of the evaluator in a political world. The Canadian Journal of Program Evaluation 19(2): 1–16. 258 Evaluation 18(2) House ER and Howe KR (1999) Values in Evaluation and Social Research. Thousand Oaks, CA: SAGE. 
Jacob S and Rothmayr C (2009) L’analyse des politiques publiques. In: Ridde V and Dagenais C (ed.) Approches et pratiques en évaluation de programme. Montréal: PUM, 69–86. Jalbert Y, Pinault L, Renaud G and Zúñiga R (1997) Epsilon. A Self-Evaluation Guide for Community Organizations. Montreal: Coalition des Organismes de Lutte Contre le Sida (COCQ-sida). Laperrière H (2008) Evaluation of STD/HIV/AIDS peer-education and dangerousness: a local perspective. Ciência e saúde coletiva 13(6): 1817–24. URL: http://www.scielo.br/scielo.php?script=sci_ arttext&pid=S1413-81232008000600016&lng=en Laperrière H (2009a) Une pratique réflexive collective de production de connaissances dans la lutte communautaire contre le VIH/sida au Québec. Nouvelles Pratiques Sociales 22(1): 77–91. Laperrière H (2009b) Les inégalités entre le local et le national: le cas de l’évaluation qualitative de la lutte communautaire contre le VIH/sida au Québec. Recherches Qualitatives 28(3): 88–111. Laperrière H and Zúñiga R (2006) Sociopolitical determinants of an AIDS prevention program: multiple actors and vertical relationships of control and influence. Policy, Politics and Nursing Practice 7(2): 125–35. Latour B (2005) Re-assembling the Social. An Introduction to Actor-Network Theory. New York: Oxford University Press. Leviton LC, Khan LK, Rog D, Dawkins N and Cotton D (2010) Evaluability assessment to improve public health policies, programs, and practices. Annual Review of Public Health 31: 213–33. Mantoura P, Gendron S and Potvin L (2007) The space of participatory research: a tool for implementing innovative and health-promoting local socio-sanitary alliances. Health and Place 13: 440–51. March JG (2004a) Le carcan de la rationalité: laisser coexister des logiques différentes. In: Friedberg E (ed.) La décision [Multimedia Work]. Collection « Questions d’organisation ». Paris: Banlieues Media Editions. March JG (2004b) On ne sais pas ce qu’on veut: comment gérer l’instabilité des préférences: l’hypocrisie. In: Friedberg E (ed.) La décision [Multimedia Work]. Collection « Questions d’organisation ». Paris: Banlieues Media Editions. Meeres SL, Fisher R and Gerrard N (1995) Evaluability assessment of a community-based program. The Canadian Journal of Program Evaluation 10(1): 103–21. Mercer SL and Goel V (1994) Program evaluation in the absence of goals: a comprehensive approach to the evaluation of a population-based breast cancer screening Program. The Canadian Journal of Program Evaluation 9(1): 98–112. Miles MB and Huberman AM (1994) Qualitative Data Analysis, 2nd edn. Thousand Oaks, CA: SAGE. Mintzberg H, Ahlstrand B and Lampel J (1999) Safari en pays stratégie. L’exploration des grands courants de la pensée stratégique. Paris: Village Mondial. Plottu B and Plottu E (2009) Approches to participation in evaluation: some conditions to implementation. Evaluation 15(3): 343–59. Polanyi M (1964) Personal Knowledge. Towards a Post-Critical Philosophy. New York: Harper & Row. Public Health Agency of Canada (2004) The Federal Initiative to Address HIV/AIDS in Canada. Strengthening Federal Action in the Canadian Response to HIV/AIDS. Ottawa: Minister of Public Works and Government Services Canada. Public Health Agency of Canada (2006) AIDS Community Action Program (ACAP) Grants and Contribution Allocation Project 2005 – Final Report. Ottawa: PHAC. URL (consulted 7 September 2009): http://www. 
phac-aspc.gc.ca/aids-sida/publication/reports/acap-pacs/acap-pacs-eng.php Public Health Agency of Canada (2009) Federal Initiatives to address HIV/AIDS in Canada: National Funding [date modified 2009-02-18]. Ottawa: PHAC. URL (consulted 7 September 2009): http://www. phac-aspc.gc.ca/aids-sida/funding/index-eng.php#chvi Ridde V (2006) Suggestions d’améliorations d’un cadre conceptuel de l’évaluation participative. The Canadian Journal of Program Evaluation 21(2) : 1–23. Rootman I and Steinmetz B (2002) Groupe de travail européen sur l’évaluation de la promotion de la santé. Bulletin de recherche sur les politiques de santé 1(3): 8–9. Laperrière et al.: A socio-political framework for evaluability assessment 259 Springett J (2001) Participatory approaches to evaluation in health promotion. In: Rootman I, Goodstadt M, Hyndman B, McQueen DV, Potvin L, Springett J, et al. (eds) Evaluation in Health Promotion. Principles and Perspectives. Copenhague: WHO Regional Publications, European Series, no. 92, 83–105. Suárez-Herrera J, Springett J and Kagan C (2009) Critical connections between participatory evaluation, organizational learning and intentional change in pluralistic organizations. Evaluation 15(3): 321–42. Thurston WE and Potvin L (2003) Evaluability assessment: a tool for incorporating evaluation in social change programmes. Evaluation 9(4): 453–69. Thurston WE, Graham J and Hatfield J (2003) Evaluability assessment: a catalyst for program change and improvement. Evaluation & the Health Professionals 26(2): 206–21. Wendhausen A (2002) O duplo sentido do controle social: (des)caminhos da participação em saúde. Itajaí: Univali. Wholey JS (1981) Using evaluation to improve program performance. In: Levin RA, Solomon MA, Helistern G-M and Wollmann H (eds) Evaluation Research and Practice: Comparative and International Perspectives. Beverly Hills, CA: SAGE. Wholey JS, Hatry HP and Newcomer KE (1994) Assessing the feasibility and likely usefulness of evaluation. In: Wholey JS, Hatry HP and Newcomer KE (eds) Handbook of Practical Program Evaluation. San Francisco, CA: Josey-Bass, 15–39. Zúñiga R and Luly MH (eds) (2005) Savoir-faire et savoir-dire: un guide d’évaluation communautaire. Montréal: COCQ-sida. Hélène Laperrière is Professor in the School of Nursing, University of Ottawa, Canada. Please address correspondence to: School of Nursing, Faculty of Health Sciences, University of Ottawa, 451, Smyth, Ottawa (Ontario), Canada K1H 8M5. [email: helene.laperriere@uottawa.ca] Louise Potvin is Professor of Health Promotion in the Department of Social and Preventive Medicine, University of Montréal, Canada. Please address correspondence to: Faculty of Medicine, University of Montreal, C.P. 6128, succ. Centre-Ville, Montréal (Québec), Canada H3C 3J7. [email: louise.potvin@umontreal.ca] Ricardo Zúñiga is Professor in the School of Social Work, University of Montréal, Canada. Please address correspondence to: School of Social Service, University of Montreal, C.P. 6128, succ. Centre-Ville, Montréal (Québec), Canada H3C 3J7. [email: ricardo.zuniga@umontreal.ca] Article Evaluation Review 36(5) 375-401 ª The Author(s) 2013 Reprints and permission: sagepub.com/journalsPermissions.nav DOI: 10.1177/0193841X12474275 erx.sagepub.com When is a Program Ready for Rigorous Impact Evaluation? 
The Role of a Falsifiable Logic Model

Diana Epstein (American Institutes for Research, San Mateo, CA, USA) and Jacob Alex Klerman (Abt Associates, Cambridge, MA, USA)

Abstract
Background: Recent reviews suggest that many plausible programs are found to have at best small impacts not commensurate with their cost, and often have no detectable positive impacts at all. Even programs with initial rigorous impact evaluation (RIE) that show them to be effective often fail a second test with an expanded population or at multiple sites. Objective: This article argues that more rapid movement to RIE is a partial cause of the low success rate of RIE and proposes a constructive response: process evaluations that compare program intermediate outcomes—in the treatment group, during the operation of the program—against a more falsifiable extension of the conventional logic model. Conclusion: Our examples suggest that such process evaluations would allow funders to deem many programs unlikely to show impacts and therefore not ready for random assignment evaluation—without the high cost and long time lines of an RIE. The article then develops the broader implications of such a process analysis step for broader evaluation strategy.

Keywords: design and evaluation of programs and policies, outcome evaluation, program design and development

Corresponding Author: Diana Epstein, 2800 Campus Drive, Suite 200, San Mateo, CA 94403. Email: diana.epstein@gmail.com

"[G]overnment should be seeking out creative, results-oriented programs like the ones here today and helping them replicate their efforts across America." President Barack Obama (2009)

Suppose that, like President Obama, one's goal was to identify new social programs to be rolled out nationwide to address pressing national problems. What process would you design for identifying those programs? This is the stated goal of the Obama Administration's "innovations funds": the Corporation for National and Community Service's Social Innovation Fund, the Education Department's i3/Investing in Innovation Fund, and the Department of Labor's Workforce Innovation Fund.1 In response to conventional practice that sometimes goes directly from social problem and plausible program ideas to broad-scale implementation, external observers have urged—and government programs have moved toward—inserting two rounds of rigorous, but costly in time and money, impact evaluation "tollgates" (see Figure 1). The first round is an "efficacy evaluation" under ideal conditions and the second round is an "effectiveness evaluation" under more realistic conditions.2 With this approach, only programs that pass these double tollgates proceed to broad-scale rollout.3

Proponents of these dual tollgates, sometimes called randomistas,4 argue that the only way to know if a program "works" is through rigorous impact evaluation (hereafter RIE) such as randomized controlled trials.5 When implemented, such RIEs have a very high rate of negative results; that is, they find no clear evidence of positive program impacts.6 This high prevalence of negative findings both emphasizes the necessity of such RIEs and suggests the need to modify the process of moving from program idea to broad-scale rollout. This article contributes ideas toward developing processes for identifying programs to be evaluated and moving those programs through a sequential evaluation process. Our emphasis on the process of program evaluation follows from our sense as evaluators that RIE is unavoidable.
Like drug trials based on a combination of solid biology and animal trials, many plausible social programs simply will not be effective. Without RIE, current understanding will not distinguish program models that will be effective from program models that will not be. RIE is needed to distinguish plausible programs that work from plausible programs that do not work. Here, we propose a complementary approach to RIE—one that we project will have lower cost, because programs that are unlikely to be found effective will not proceed to the expensive RIE phase.

Figure 1. Rigorous impact evaluation before broad rollout. (Diagram boxes: Program Idea, Efficacy Evaluation, Replication, Effectiveness Evaluation, Broad-scale Rollout.)

Given that only RIE can establish whether a program works, how can we increase the rate of positive findings? Building on discussions by earlier evaluation theorists—in particular, Wholey (1994) and Weiss (1997)—we argue that what we call a "falsifiable logic model," combined with process evaluation methods carefully applied to the problem at hand, can help identify programs that are unlikely to be found effective by an RIE. Building on this earlier literature, our contribution is twofold. First, we explain how these ideas can be incorporated into the tollgate framework that is increasingly being used in the evaluation of new federal programs and, more broadly, into government procurement strategies for evaluation. Second, we enumerate five specific forms of logic model failure. For each of those five specific forms, we show by example that applying our proposed approach plausibly would have screened out programs that failed RIE.

Figure 2 summarizes our proposed approach, which augments the RIE phase of Figure 1 with an iterative sequence of formative evaluation and process evaluation. In this augmented model, a process evaluation serves as an additional tollgate: Only a program that passes its own logic model as measured by the process evaluation proceeds to the impact evaluation stage. Sometimes a promising program model needs several iterations before a workable approach can be found. If we evaluate too early, we will deem a failure a program model that after another round (or rounds) of development might be deemed a success. Sometimes even after several iterations, however, a promising program will still not pass this evaluability tollgate.

We believe that our approach serves two important purposes: first, the falsifiable logic model (FLM) we propose has a winnowing function, that is, our proposed approach helps to narrow the list of programs that proceed to RIE. Second, it has a program-improving function; that is, our proposed approach helps a program to strengthen its implementation and eventually be more likely to be deemed successful by RIE. Because some programs would never be subject to RIE and because other programs would only be subject to RIE after program refinement, more of the programs subjected to RIE would ultimately be deemed successful.

The balance of this article proceeds as follows. The next section situates our ideas in earlier evaluation theory and the second section describes in detail how we suggest applying these ideas to the randomistas' tollgates. These ideas are only useful if our proposed "logic model approach to evaluability" would actually screen out programs as not ready for RIE; that is, if it can identify programs that would fail RIE before RIE commences.
To show that our logic model approach can screen out programs, the third section provides detailed examples of five forms of logic model failure that could have been identified by a process analysis. Having shown that, at least in some cases, our logic model approach to evaluability would have screened out programs, the fourth section considers how to incorporate the logic model tollgate into the evaluation process. The fifth section considers some broader implications of our approach for evaluation strategy. The final section considers the applicability of the approach.

Earlier Discussions of "Evaluability"

This article aims to resuscitate ideas first proposed by some of the founding scholars of program evaluation and situate them in the modern movement toward RIE. Our argument builds on Campbell's "contagious cross-validation model," in which multiple program options would first be tested with a less rigorous screen, and only the most successful would then proceed to RIE (Campbell 1984).

Figure 2. Logic model approach to evaluability.

Our approach further builds on the work of Joseph Wholey and what he termed evaluability assessment.7 In Wholey's words, evaluability assessment "is a process for clarifying program designs, exploring program reality, and—if necessary—helping redesign programs to ensure that . . . program goals, objectives, important side effects, and priority information needs are well defined, program goals and objectives are plausible, relevant performance data can be obtained, and the intended users of the evaluation results have agreed on how they will use the information" (Wholey 1994).

Our focus on the intermediate outcomes specified by the program's theory is closely related to the "theory of change" approach to evaluation proposed by Weiss (1997, see Chapter 3, and especially p. 608). She argues: "They, or better still, theories, direct the evaluator's attention to likely types of near-term and longer-term effects . . . provid(ing) early indications of program effectiveness. It [the evaluation] need not wait until final outcomes appear (or fail to appear)." Furthermore, this information can have a constructive effect, pointing to how the program might be improved: "If breakdown occurs in the premises of implementation, the implications are immediate and direct: Fix the way the program is being run."

The balance of this article operationalizes this idea and proposes how it might be embedded into the tollgate approach.9 Specifically, we refine and provide examples of how logic models10 can help us to decide what to evaluate. Previous use of logic models has helped program developers to get inside the black box of what a program did or did not do (e.g., Rogers et al. 2000; Pawson and Tilly 1997; Leviton and Gutman 2010); we extend this idea to program evaluation. While our approach relies heavily on a program's logic model, our approach is perhaps less theoretical. We posit that, like management consulting, formative evaluations and process evaluations proceed from an implicit model of how programs should operate. Given this implicit model, formative evaluators and management consultants infer how program implementation should change in order to increase program impact. As empiricists, we place less confidence in implicit models and we are therefore reluctant to use program theory in this way.
However, the essence of our argument is that a program's logic model implicitly embodies a program theory, and a necessary condition for the program to be effective is that it succeeds according to its own program theory. Our approach therefore requires a program to make its program theory explicit in the form of what we call an FLM. Programs that do not satisfy the requirements of their own FLM can be screened out from more expensive RIE.

The Logic Model Approach to Evaluability

Specifically, our approach expands the role of the logic model. Conventionally, the logic model is a tool for thinking through causal pathways; in our approach, the logic model becomes a tool for evaluating whether a program is satisfying its own stated approach. The foundation of our approach is a requirement that an expanded logic model specify detailed—and falsifiable—goals for one of the components of a conventional logic model—intermediate outcomes that must be realized by members of the treatment group in order for the program to succeed. As will be clear in what follows, our FLM is a major extension of conventional logic models. Conventional logic models specify a sequence of steps; our FLM includes considerably more implementation detail, often including specific quantitative benchmarks for intermediate outcomes. Such intermediate outcome goals could be both quantitative and qualitative.11 Thus, a training program's logic model might specify intermediate benchmarks such as: (i) space is secured and a curriculum developed; (ii) a specified number of instructors complete a specified number of hours of training; (iii) classes of a specified size are recruited; (iv) instructors teach with fidelity to the curriculum (and fidelity has a precise operational definition); (v) students attend all or most of the classes (with a quantitative standard of what fraction of students will attend what fraction of classes); (vi) students master material, as measured by a gain of X percent on a pretest/posttest; and end impacts such as (vii) students find jobs that use acquired competencies; and (viii) students are retained in those jobs for at least X months.

Asking program developers to specify their logic models in the level of detail required for an FLM is valuable for four reasons. The first reason underlies the conventional role for a logic model: developing a logic model at this level of explicitness and detail helps program developers to refine their program vision. A program model is more than an "idea" or an "insight." In developing a program, it is crucial to think realistically. Programs are rarely experienced in their ideal form. Not everyone will attend all of the sessions; not everyone will pass the final exam. What are realistic expectations? As important, a program model should specify what intermediate outcomes are needed in order for the program to have an impact. Can the program be cost-effective if classes are not full? Can the program have the projected impact if only two thirds (half?) of the students attend all (at least three quarters?) of the sessions? The process of thinking through the details of program operation should identify issues, lead to improved program design, and thereby result in better program outcomes. This is the conventional argument for urging new programs to develop a logic model. Requiring a logic model at this level of detail seems reasonable before giving a program funds for a pilot program.
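To make the idea of falsifiable benchmarks concrete, the sketch below shows one way a training program's FLM could be written down and checked against observed intermediate outcomes from a pilot before the program is allowed through the RIE tollgate. It is only an illustration of the logic described above, not code from the article: the benchmark names, metrics, and threshold values are hypothetical assumptions.

```python
# Hypothetical sketch: encoding a falsifiable logic model (FLM) as explicit,
# checkable benchmarks and applying it as a pre-RIE "tollgate".
# All benchmark names, metrics, and thresholds are illustrative assumptions,
# not figures taken from Epstein and Klerman's article.
from dataclasses import dataclass

@dataclass
class Benchmark:
    name: str          # intermediate outcome the program theory requires
    metric: str        # key expected in the program's operating records
    threshold: float   # minimum value the FLM commits to in advance

# A training program's FLM, stated before the pilot (cf. benchmarks (i)-(vi) above).
flm = [
    Benchmark("instructors trained", "instructor_training_hours", 40),
    Benchmark("classes recruited to size", "avg_class_size", 20),
    Benchmark("curriculum fidelity", "fidelity_score", 0.80),
    Benchmark("attendance", "share_attending_75pct_of_sessions", 0.66),
    Benchmark("learning gain", "pre_post_gain_pct", 15),
]

# Observed intermediate outcomes, taken from ordinary operating records and an
# end-of-program test; no control group or follow-up survey is needed.
observed = {
    "instructor_training_hours": 42,
    "avg_class_size": 14,                        # classes under-enrolled
    "fidelity_score": 0.85,
    "share_attending_75pct_of_sessions": 0.51,   # attendance below target
    "pre_post_gain_pct": 9,
}

def tollgate(flm, observed):
    """Return (ready_for_rie, list of benchmarks the program failed)."""
    failures = [b for b in flm if observed.get(b.metric, 0) < b.threshold]
    return (len(failures) == 0, failures)

ready, failures = tollgate(flm, observed)
if ready:
    print("Intermediate benchmarks met: program may proceed to RIE.")
else:
    print("Not ready for RIE; refine the program first. Unmet benchmarks:")
    for b in failures:
        print(f"  - {b.name}: observed {observed[b.metric]} < target {b.threshold}")
```

The point is less the code than the commitment it represents: the thresholds are stated in advance, measured from routine operating records rather than an expensive follow-up survey, and a program that misses its own targets is sent back for refinement instead of proceeding to a costly impact evaluation.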
The second reason for asking program developers to specify their logic model in detail is more subtle. When trying to "sell" the program, program proponents have a strong incentive to overpromise. In the absence of a strong evidence base, setting target outcomes may be somewhat arbitrary. A program that promises to yield large impacts for many people is inherently more attractive than one for which only small impacts are promised.12 In contrast, at the evaluation stage, program proponents have an incentive to lowball their estimates. If they overestimate or even give their best guess, it is possible that the program will not meet the stated performance goal at RIE and will perhaps be cancelled.13 These conflicting pressures could operate to induce what game theory calls "truth telling"; that is, giving program proponents a stronger incentive to offer realistic estimates.14 The strength of these two opposing pressures will vary from case to case, and they will rarely cancel out so exactly as to induce complete "truth telling." Nevertheless, the knowledge that stated benchmarks will be used in evaluation should improve the realism of claims. As a result, the stated benchmarks for intermediate outcomes in our proposed FLM could help in selecting which new programs to fund for the first time and which existing programs to continue funding.

The third reason for asking program developers to specify their logic model in detail is that such falsifiable goals are crucial to our proposed approach for determining when a program is ready for RIE. In the pilot phase, our approach compares the specified benchmarks for a successful program and for the follow-on RIE against observed intermediate outcomes.

The fourth reason for asking program developers to specify their logic model in detail concerns evaluation. Inasmuch as there is an RIE tollgate, it must be possible to conduct an RIE, and RIE designs have certain implicit assumptions about program implementation that can be checked at the pilot. For example, random assignment with the standard equal assignment to treatment and control groups requires twice as many applicants as program slots. Many evaluations fail for insufficient applicants, and the implicit assumption of having more applicants than could be served could be checked in a pilot.

We acknowledge that in current practice, logic models often lack the specificity required by our approach.15 For example, the IES evaluation report on the Even Start literacy program states that the children had very different amounts of exposure to the program. This is not (necessarily) a result of failed implementation; rather, "Even Start guidelines do not specify an expected level of exposure for children or parents, and the hours of instruction offered by local projects vary widely" (Judkins et al. 2008). As this example illustrates, it is our premise that falsifiable goals are a reasonable degree of specificity for a program seeking substantial funds.

We conclude this section by acknowledging that satisfying the benchmarks for program operation and for outputs and outcomes in the treatment group specified in such an FLM will often not be sufficient to achieve the desired end impacts. Even if there is improvement in outcomes for the treatment group such that benchmarks are satisfied, there may not be any impact. Control groups often also show improvement—sometimes due to regression to the mean, sometimes due to other similar programs in the community. Thus, even a program that satisfies its own logic model for outputs and outcomes for the treatment group (e.g., pre/post progress on a standardized test) may fail an RIE for impact on long-term outcomes, that is, outcomes for the treatment group relative to outcomes for the control group (e.g., earnings).16
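A toy calculation (our own illustrative numbers, not the article's) shows how a pilot can clear its own pre/post benchmark and still leave little room for an impact once the control group's improvement is netted out:

```python
# Illustrative numbers only: mean test scores before and after the program.
treatment_pre, treatment_post = 40.0, 52.0   # treatment group improves by 12 points
control_pre, control_post = 40.0, 50.0       # control group also improves (maturation, other services)

flm_benchmark_gain = 10.0                    # FLM benchmark: at least a 10-point pre/post gain

treatment_gain = treatment_post - treatment_pre                      # 12.0
impact_estimate = treatment_gain - (control_post - control_pre)      # only 2.0

print(f"pre/post gain in the treatment group: {treatment_gain}")
print(f"clears the FLM benchmark of {flm_benchmark_gain}? {treatment_gain >= flm_benchmark_gain}")
print(f"impact estimate once the control group is netted out: {impact_estimate}")
```

The screening logic runs in the other direction: a pilot whose treatment group cannot even clear its own pre/post benchmark is very unlikely to show an impact against a control group that is itself improving.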
Nevertheless, a program that cannot achieve the intermediate goals specified by its own logic model will not, according to its own logic model, have the desired (usually longer term) impacts and therefore should not proceed to RIE.17 Crucially, note that this determination of whether a program achieves the intermediate outcomes specified by its own logic model can often be made using conventional process evaluation methods, that is, careful observation of program operation, without random assignment and without a comparison group. In addition, in most cases and by design, this determination should be possible at low cost. Our approach is attractive exactly because it is inexpensive—it only requires program operating records (e.g., enrollment and attendance) and end-of-program tests. Thus, to be useful, an FLM needs to specify benchmarks for intermediate outcomes that can be measured without a follow-up survey (with its expensive tracking of former program participants) and without a need to locate and survey nonparticipants (a control group or comparison group).

Five Forms of Logic Model Failure

If most evaluated programs could pass this additional tollgate—their own logic models—then this logic model approach to evaluability would not be useful; it would not screen out programs, and so it would not aid the broader RIE process. In fact, the opposite is true. Many evaluated programs fall short of (the currently implicit) expectations on these intermediate outcome tests, thus failing their own logic models. Other program models may not fail their FLM, but they do fail to meet the conditions required for the follow-on RIE.

This section specifies and provides examples of five common forms of failure of programs to satisfy their own logic models: (i) failure to secure required inputs; (ii) low program enrollment; (iii) low program completion rates; (iv) low fidelity; and (v) lack of pre/post improvement. Then, for each form of failure, we discuss how it could have been detected by a process evaluation. The sources for these examples are our own experiences and observations, as well as discussions with experts and years of reading the evaluation literature. We recognize that similar lists exist elsewhere in the program development and evaluation literature. Furthermore, we do not intend for this to be a checklist of midpoint accomplishments that all programs must pass. Rather, we hope that identifying these common sources of logic model failure will promote additional critical thinking during program development and implementation. Specific FLMs will need to specify their own benchmarks—sometimes applying the broad categories listed here, at other times adding new categories. Inasmuch as these ideas are applied, this list will almost certainly need to be expanded and refined. Here it serves primarily as an organizing device for our proofs of concept, that is, examples of benchmarks that could have been prespecified and were not achieved.

Failure to Secure Required Inputs
One way in which a program fails its own logic model concerns the inputs into the program. Program models often implicitly posit the ability to establish interorganizational partnerships and to recruit and retain certain types of staff. Sometimes those partnerships never materialize or the staff cannot be recruited or retained. For example, the Employment Retention and Advancement (ERA) program in Salem, Oregon, struggled with both high turnover among case manager staff and a difference in philosophies between staff recruited from welfare agencies and those from community colleges. These implementation challenges affected service delivery and hence the benefits that participants were able to obtain from the program (Molina, Cheng, and Hendra 2008). Thus, an FLM should specify the partnerships to be established, the qualifications of the staff to be hired, and the projected retention rate of those staff.

Relatedly, sometimes the way the program operates will not support RIE. An example comes from an Abt Associates evaluation of evidence-based practice in probation that aimed to see whether caseload reductions among probation officers reduced criminal recidivism and increased revocations for technical violations (Jalbert and Rhodes 2010). An agency in Oklahoma City agreed to implement a randomized controlled trial, and probation officers volunteered either to be assigned to a reduced caseload or to maintain a regular caseload. However, the experiment degenerated because a large number of officers with regular caseloads either left the agency or accepted a different assignment. This disrupted the equivalence of the random assignment groups and reduced the study's power such that the evaluators had to switch to a difference-in-differences study design. Implicitly, the evaluation design assumed no turnover among officers, but a pilot program would have detected this problem. Each of these intermediate outcomes—partnerships to be established, qualifications of staff to be hired, and staff retention rates—could have been checked using conventional process analysis methods.

Low Program Enrollment

A second way that a program fails is that it does not attract the target number of clients/participants. Programs are only worth running if there is demand. Program models implicitly assume that there is a need for the program—and that people will enroll. In practice, that often is not the case. Underenrollment relative to plan is a fundamental logic model failure, and it is quite common.

The best documented examples of underenrollment arise in RIE. For random assignment to be feasible, a program needs to have a surplus of clients (usually double). The ideal situation for RIE is therefore an existing program with a long waiting list. When such a long waiting list exists, random assignment can often be viewed as the most ethical approach to deciding who will be served. When there is not currently a waiting list, sometimes a program can attract additional clients through advertising and recruiting. For a program that is already achieving its target enrollment, resources expended on recruiting and advertising are arguably wasted, but moderate additional expenditures on advertising and recruiting will yield the larger number of applicants required to implement random assignment.
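A back-of-the-envelope recruitment check of the kind a pilot's intake records could support; the consent and show-up rates below are assumptions we made up for the sketch, not figures from the article:

```python
# Rough feasibility check for random assignment with a 1:1 treatment/control split:
# the applicant pipeline must be roughly double the number of program slots,
# and somewhat larger once study consent and no-shows are accounted for.
program_slots = 200          # people the program can actually serve per cohort
assignment_ratio = 2.0       # equal-sized treatment and control groups
consent_rate = 0.85          # assumed share of eligible applicants who consent to the study
show_up_rate = 0.90          # assumed share of treatment-group members who actually enroll

applicants_needed = program_slots * assignment_ratio / (consent_rate * show_up_rate)
observed_applicants = 230    # e.g., intake per cohort observed during the pilot

print(f"applicants needed per cohort: {applicants_needed:.0f}")
print("pilot intake supports an RIE?", observed_applicants >= applicants_needed)
```

Whether that buffer exists is exactly the sort of question a pilot's intake records can answer before anyone is randomized.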
However, a program that cannot even attract the target size of the treatment group (or is expending considerable resources to do so) is not ready to recruit sufficient program applicants to implement random assignment. In practice, many RIEs have trouble recruiting applicants at the target rate. Underenrollment and extended enrollment periods to achieve target sample sizes are not the exception; they are the rule. Sometimes the RIE is cancelled, as was the case in the Portland Career Builders ERA program, which recruited a random assignment sample only a third of the target size (Azurdia and Barnes 2008). Other times the RIE limps forward on less than the target number of applicants. In either case, an earlier process study could have detected the recruitment challenges and would have indicated that the program was not ready for an RIE.

Of course, a successful program can usually expect demand to grow. An existing program develops a referral network, and claimed impacts and individual success stories help build demand. Nevertheless, it is exactly our premise that we should not be moving to RIE until there is a fully rolled-out program considerably larger than the pilot program, even in the pilot site. In that case, at least moderate excess demand (i.e., a waiting list) is a reasonable requirement before proceeding to—and a requirement for successful implementation of—RIE.

Low Program Completion Rates

A third way that a program fails is that sometimes participants initially enroll in the program but do not complete the expected treatment. We know this is a failure of the logic model because, ex post, reports of rigorous impact studies point to failure to complete the treatment as the reason for null results. For example, the report on the rural welfare-to-work strategies evaluation attributes the lack of substantial impacts to the fact that only about two fifths of the target clients received substantial Future Steps services (Meckstroth et al. 2006). In the South Carolina Moving Up ERA program, only half of the program group was actually engaged in the program's services during the year after they entered the study, and of those who were engaged, the intensity of engagement varied such that some were only minimally involved in the program (Scrivener, Azurdia, and Page 2005). Another example is the Cleveland Achieve ERA program, where participation varied widely such that the overall intensity was less than the program designers had envisioned (Miller, Martin, and Hamilton 2008).

Similarly, Building Strong Families' logic model asserted that multiple group sessions on relationship skills, combined with other support services to unmarried couples around the time of their child's birth, could improve relationship skills and thereby increase marriage rates and decrease divorce rates (Wood et al. 2010). However, in all but one of the sites, less than 10% of the couples received at least 80% of the curriculum. Only in the one site where 45% of the couples received at least 80% of the curriculum was there an impact on measures of relationship quality.
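The dosage question raised by these examples can be checked directly from attendance records. A minimal sketch with made-up numbers (ours, not from any of the cited studies):

```python
# Hypothetical attendance records for one pilot cohort (enrollee_id -> sessions attended),
# used to check an FLM dosage benchmark such as
# "at least 60% of enrollees attend at least 80% of the 20 sessions."
sessions_offered = 20
attendance = {1: 20, 2: 18, 3: 9, 4: 16, 5: 3, 6: 17, 7: 12, 8: 19, 9: 6, 10: 15}

dosage_threshold = 0.80      # fraction of sessions that counts as receiving the treatment
benchmark_share = 0.60       # FLM benchmark: share of enrollees reaching that dosage

treated = [sid for sid, n in attendance.items() if n / sessions_offered >= dosage_threshold]
share_treated = len(treated) / len(attendance)

print(f"share of enrollees attending >= {dosage_threshold:.0%} of sessions: {share_treated:.0%}")
print("meets the FLM dosage benchmark?", share_treated >= benchmark_share)
```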
It is our reading of the literature that there is often a presumption that most (perhaps nearly all) of the sessions must be attended in order for the program to have its full effect. This is clearly true in a program that grants a certificate or a degree. In other programs, such a presumption is signaled by the fact that final sessions often include some special capstone or wrap-up activity. A presumption that most sessions need to be completed is also sometimes implied by the fact that we fund all of the sessions: if we thought the later sessions were not needed, we would not have funded them. Universal attendance at every program session is not realistic. Nevertheless, below some threshold, impacts seem extremely unlikely. Thus, an FLM should specify what defines "enrollment" and what fraction of those enrollees need to attend what number of sessions in order for there to be a measurable (perhaps cost-effective) impact. Then, actual attendance can and should be measured against the FLM's stated benchmarks.

Low Fidelity

A fourth way in which a program fails is that sometimes clients attend, but the program as implemented falls short of what was envisioned in the logic model. For example, a Mathematica random assignment evaluation of four supplemental reading comprehension programs found no evidence of consistent impact for any of them. However, the study also found evidence of far from complete implementation of the curricula themselves: only about three quarters of the targeted practices were actually implemented (James-Burdumy et al. 2010). Similarly, Abt's random assignment evaluation of a national student mentoring program found no consistent pattern of impact (Bernstein et al. 2009). In explaining that null result, the report noted that the average amount of mentoring received was lower than the benchmark provided by model community-based mentoring programs.

Thus, an FLM should specify what constitutes (sufficient) fidelity of implementation, and that specification needs to be specific enough to be falsifiable. What defines how the program would be implemented ideally? How will fidelity be measured? What amount of deviation from that ideal implementation would constitute failure? This step has important implications for program development as well. Once program developers have defined implementation with fidelity, they should go back to their plan for training and supervising staff. Do those training materials make clear how the program is expected to be implemented? Have the instructors been tested to assure that they have mastered the techniques and learned the expectations? Does the program model include sufficient supervisory resources and a supervision scheme that can reasonably be expected to lead to implementation of the program with fidelity? This part of the development of the logic model will lead to additional falsifiable outcomes for the first way that programs fail (insufficient inputs, e.g., staff recruited and trained) and for the fourth way that programs fail (low fidelity, e.g., was the supervisory plan implemented—perhaps determined by checking supervisor reports).

Lack of Pre/Post Improvement

The fifth way in which programs fail is that sometimes clients show minimal or no progress on pre/post measures of the intermediate outcome the program was intended to affect. Pre/post measures are subject to well-known biases—in particular, history and maturation; this is why most pre/post designs are not considered RIEs. Nevertheless, the treatment group's pre/post change is often a useful ceiling on what an RIE could find: because we usually expect the control group to improve as well, the estimated impact will typically be no larger than the raw pre/post gain.
History can break this logic; for example, a weakening economy might pull post outcomes for the treatment group below their pre levels even though the decline would have been larger still in a control group, so a program could have an impact despite showing no pre/post gain. Whether this critique is valid will vary from evaluation to evaluation and will depend on how significant the changes in the external environment (e.g., economic conditions) were. In general, choosing outcomes that are more under the control of the program—for example, test scores on the material taught rather than labor market outcomes—will lessen the salience of the history critique.

Thus, in the National Evaluation of Welfare-to-Work Strategies (NEWWS), clients randomized to the Human Capital Development (HCD) component showed no progress on objective achievement tests (Bos et al. 2002). Inasmuch as the logic model for these HCD programs implied that earnings would rise because clients learned basic skills in reading, writing, and mathematics, the program was a failure. NEWWS HCD programs did increase the number of people who received a General Educational Development (GED) diploma, absolutely and relative to the control group. Here and in general, the specifics of the logic model matter. If the program's logic model had posited that earnings would rise because clients get a GED, even if they do not learn anything, then it might have been reasonable to proceed with RIE. However, the HCD program's logic model had specified actual learning. In this sense, we propose to hold programs to their own logic models and, conversely, not to let programs define achieved outcomes as success ex post (after they see the results; in this case, GEDs but no improvement in test scores).

Thus, an FLM should specify pre/post changes in outcomes for the treatment group, for example, skill attainment, progress on standardized tests, graduation rates, or receipt of certificates. To be a useful screening device for programs that should be subjected to RIE, these measurements must be inexpensive. Not all enrollees will still be around at the end of the program. Thus, we would not want a standard in terms of true outcomes for all enrollees; we cannot easily observe outcomes for initial enrollees who do not complete the program (for whatever reason). We also do not want measures conditional on completion (the people we happen to observe). Rather, we want standards in terms of the incoming class, for example, half of the entering class will get a certificate through the program. People who complete but do not get a certificate and people who leave the program before completion would count toward the denominator (program enrollees) but not toward the numerator (those receiving a certificate). The program might like to claim credit for enrollees who earn a certificate outside the program, but such certificates are not easily measured (and are probably not due to the program), so it is better to exclude them from the standard.

As in all performance management systems, in such FLMs it will be crucial to define "enrollees" carefully. The Department of Labor's Workforce Investment Act training programs sometimes delay official "enrollment" until the program is relatively sure that a trainee will complete the program. For most purposes, a more appropriate definition will be people who receive any program services.
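As a purely hypothetical illustration of that denominator rule, the standard is computed over everyone who entered, not over completers:

```python
# Hypothetical pilot cohort, illustrating a certificate-rate standard computed
# against the incoming class rather than against completers.
entering_class = 100               # everyone who received any program services
completed_program = 70
certificates_through_program = 48
certificates_earned_elsewhere = 5  # not credited to the program under this standard

flm_standard = 0.50                # e.g., "half of the entering class will earn a certificate"

rate_vs_incoming_class = certificates_through_program / entering_class   # 0.48 (the standard's definition)
rate_vs_completers = certificates_through_program / completed_program    # 0.69 (flattering, but not used)

print(f"certificate rate against the incoming class: {rate_vs_incoming_class:.2f}")
print(f"certificate rate among completers (not used): {rate_vs_completers:.2f}")
print("meets the FLM standard?", rate_vs_incoming_class >= flm_standard)
```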
Finally, note also that such performance standards give program operators strong incentives for "cream skimming," that is, enrolling only trainees who are likely to meet the standard. Program funders need to be aware of those incentives. If the trainees most likely to meet the standard are not the target population, this points to another standard under the second way in which programs fail (insufficient enrollment): enrollment standards need to specify not only the number of enrollees but also enough about their characteristics to assure that the target population is being enrolled.

Conversations with researchers and research sponsors suggest that each of these logic model failures is common. These failures reflect gaps in program planning and development. They are also embarrassing and therefore rarely end up in the formal literature. Sometimes, as a result of these logic model failures, the study is cancelled and no report is produced. More frequently, the program limps through the RIE: a different partnership is attempted; staff standards are relaxed; the intake period is held open much longer than expected. These changes from the original design are mentioned only in passing, if at all. Finally, sometimes the problems—poor attendance at sessions or minimal progress even in the treatment group—are not noticed until the project's final report is written, where they are either not mentioned at all or are used to explain why the program did not show impacts.

Crucially for our argument, note the common pattern of these examples. Each of these programs was subjected to expensive and time-consuming RIE. Those RIEs found no impact, overall or at most program sites. The RIEs were, however, unnecessary. Through site visits and analysis of program records, a process evaluation could have collected information on partnerships and staffing, initial enrollment, attendance at sessions, and pre/post progress on the target intermediate outcomes. That information could have been compared against a more detailed version of the conventional logic model. Those comparisons would have clearly demonstrated that, according to the program's own logic model, the program was unlikely to have impacts and therefore was not ready for RIE.

Fitting the Logic Model Step Into an Evaluation Process

The previous section has argued that, through a process evaluation, it is possible to screen out many programs as not ready for RIE. In this section, we point out three other direct implications of this negative screen.

First, programs that fail their own logic models do not necessarily need to be discarded without further consideration. The premise of technical assistance (management consulting, in the for-profit world) is that, by observing a program as it currently operates, an experienced outsider can suggest ways to improve the program's operation. Perhaps with those improvements, the program will meet its own logic model. That such technical assistance—or even just another cycle of program operation—will help is more plausible for relatively new programs. In initial cycles of program operation, steps often clearly fail, but the experience of failure suggests ways to improve those steps. Unfortunately, however, federal demonstration programs sometimes subject brand new programs to RIE.
Thus, the United States Department of Agriculture's (USDA) Healthy Incentives Pilot was designed and implemented with a random assignment evaluation beginning from the first month of operation. Similarly, USDA's Summer Electronic Benefit Transfer (EBT) for Children program was established and received a pilot (i.e., small sample) random assignment evaluation in its first year of operation and a full and quite large random assignment evaluation in its second year of operation. Similarly, federal training programs are often funded by short-term contracts with an RIE requirement. But this means that we apply RIE to a very early implementation. It seems plausible that lessons learned in the first year or two of operation might lead to program refinements, improved program implementation, and improved client outcomes.

The possibility of revising programs so that they will meet their own logic models suggests building such a "formative evaluation" or "technical assistance" step into evaluations—either once the program fails its own logic model, or even before proceeding to the process evaluation. A caveat, however, is in order. Among the activities of a formative evaluation is helping a program refine its logic model, but the FLM should be set before the program is implemented. Once the program is implemented, formative evaluation and technical assistance should focus on changing the details of program implementation to satisfy the benchmark intermediate outcomes in the program's FLM as stated initially. After seeing the results, simply lowering the quantitative benchmarks—how many sessions a client needs to attend, how much progress the client must make on an achievement test—is too easy. Instead, a program with a significantly different logic model, or even substantially different quantitative goals, should often be viewed as a totally new program—and be required to recompete for funding with other new or existing programs.

Second, even if a program can sometimes be refined so that it will meet the benchmarks in its FLM, it does not follow that we should (always) fund such refinement efforts. Some program models that fall short of expectations in their first (or second, or nth) iteration will fulfill their own logic models with another cycle through formative evaluation and then process evaluation. Other program models should simply be terminated: these programs were initially promising, they have been tried, and they have been deemed ineffective (i.e., they failed to achieve their own logic models' intermediate outcomes). Limited resources should be transferred from these failed programs to other promising program models.

The challenge, of course, is deciding which programs to iterate and which programs to simply terminate. Unfortunately, we have no clear guidance on when to continue investing and when to move on. Foundation program officers, social venture capitalists, and government employees currently make these decisions. As of now, our only guidance is to carefully consider the trade-off (in time and resources) between continuing to invest in a program that might be saved versus scrapping it entirely and instead beginning to explore some other program that offers a different, and perhaps better, approach. Perhaps making the trade-offs of each choice explicit will improve decision making. Additional insights into this choice would clearly be useful.

Figure 2 suggests the third implication of our negative screen.
Current evaluation strategy for new government programs often includes some formative evaluation/technical assistance and even some process evaluation. However, those steps are often funded simultaneously with, and as part of the same contract as, the impact evaluation: formative evaluation/technical assistance is provided immediately before, and as part of, proceeding to random assignment. Such a single contract shrinks the necessarily long interval from program idea to broad-scale rollout. The disadvantage is that a single-contract approach implicitly assumes that most programs will go from program idea through formative evaluation and process evaluation to impact evaluation. However, the thrust of our argument is that many (perhaps most) programs will fail at the process evaluation stage and therefore should not proceed (at least immediately) to RIE.18

That many (perhaps most) programs should not proceed immediately from process evaluation to RIE is an additional reason not to issue one contract for both "program development" and RIE. Instead, this line of argument supports issuing a first contract for "program development," that is, technical assistance and formative evaluation. If the program and the formative evaluator decide the program is ready, then proceed to competition for a second contract for a process evaluation that would compare intermediate outcomes to the benchmarks specified in the program's own FLM. If the program and the formative evaluator instead decide that the program is not ready for the process evaluation, then the program should apply—competitively—for another round of technical assistance and formative evaluation. For programs that proceed to the process evaluation phase, at the end of that phase the evaluator would prepare a report and the funder would choose among three options: (i) proceed to a third competition for a new contract to conduct the RIE; (ii) proceed to a competition for another round of program development and process evaluation; or (iii) terminate funding for the program.19

While it is true that some things just cannot be rushed, separate contracts will often be an extreme solution. Separate contracts have costs—in time and in coordination. Short of separate contracts, contractual terms might be specified cognizant of the likelihood that it will often turn out not to be worthwhile to proceed from formative evaluation to process evaluation, or from process evaluation to impact evaluation. Thus, contracts might be structured in two parts, with a clear implication that the second part is not certain. A formal, and perhaps external, review step could be introduced. To counteract the incumbent evaluator's bias toward proceeding to RIE (additional funding), the reviewers could be instructed to carefully examine the evaluator's process evidence and to lean against proceeding.

Some Broader Implications for Evaluation Strategy

The logic model approach to evaluability and program development described in this article embodies a strong implicit assumption about program motivation: it assumes that a program's ultimate goal is to grow from program idea to broad program rollout. It further assumes that the only way to do so is by satisfying the various evaluation tollgates. In practice, neither of those assumptions is universally correct.
First, with respect to the desire to move promptly through evaluation to broad program rollout, this might be plausible if: (i) the primary goal of program developers is to get their programs rolled out nationally as quickly as possible and (ii) program developers have complete faith in the evaluation process. Neither of those conditions is likely to be satisfied. Some program operators would be content to run small local programs. In some cases, this is because their vision truly is local. In other cases, there is a fear of RIE (Campbell 1969). Such program operators believe that they are achieving their desired results; they try to avoid RIE because they fear the program might—in their view, incorrectly—be deemed ineffective. Other program operators do not believe that their program can be meaningfully evaluated with a random assignment approach, perhaps because important benefits are not (easily) measurable.20 Still others feel that limited resources should be used to serve clients and should not be diverted toward RIE. For each of these groups of program operators, prolonged "program development" will often be the ideal outcome.

Second, with respect to the necessity of program evaluation in order to proceed to broad-scale rollout, some recent initiatives are consistent with this perspective. The evidence-based Nurse-Family Partnership ($1.5 billion over 5 years) and the evidence-based Teen Pregnancy Prevention Program ($110 million in FY2010) each provided substantial funding for the broad rollout of programs that have passed RIE at both the efficacy and effectiveness level (Orszag 2009a). As Orszag explained: "[This approach] will also create the right incentives for the future. Organizations will know that to be considered for funding, they must provide credible evaluation results that show promise, and be ready to subject their models to analysis." From this perspective, it appears that the Education Department's Investing in Innovation Fund (i3), for example, takes the right approach. Rather than funding only program pilots or formative evaluations, i3 awarded the largest amounts of money to promising existing programs and gave preference to programs that are supported by stronger evaluation evidence.

However, this approach is the exception rather than the rule. As Orszag (2009a, 2009b) and others acknowledge, many programs without RIE evidence and even some programs with negative RIE evidence continue to be funded, often at high levels. Given this reality, avoiding RIE may also be a viable strategy for program developers to pursue.

These two factors imply that evaluators will often need to induce programs to participate in RIE. Once evaluators need to induce programs to participate in RIE, it is not clear that they can insist that programs develop FLMs and participate in the long time line and onerous sequence of evaluation steps described here. Our guidance is simple: this sequence of evaluation steps should be a requirement (or at least a major plus factor) of funding for pilot programs and for proceeding to broad-scale rollout of existing programs. We understand that the reality diverges from that simple guidance. That divergence will make it more difficult to implement the sequence of evaluation steps described in this article. More consideration of these issues is needed.
Closing Thoughts and Next Steps

This article has argued that some programs are undergoing an RIE too early and that more resources should be devoted to determining whether a program is ready for RIE. A more detailed FLM, combined with a careful process evaluation, would frequently detect programs that failed, for example, to: (i) complete partnerships or hire the desired staff; (ii) recruit sufficient qualified program participants; (iii) induce program participants to engage with the complete program; (iv) implement the program with sufficient fidelity; and (v) improve program participants' outcomes relative to the program's own goals. A program that has failed to satisfy the intermediate outcomes of its own FLM is unlikely to have positive long-term impacts and therefore should not proceed to RIE.

From this insight—that some programs can be rejected based on the results of benchmarks for the treatment group during (or shortly after) the end of the program—emerges an answer to the question with which we began this article: What process would you design for identifying programs worthy of broad-scale rollout? Our suggested process was depicted in Figure 2: (i) require the program to specify an FLM as part of its application for funding; (ii) fund a pilot with a corresponding formative evaluation; (iii) if repetition of the formative evaluation step is needed, decide whether to fund it or, alternatively, to abandon the program model; (iv) proceed to a process evaluation that verifies the satisfaction of the program's own FLM; (v) if repetition of the process evaluation step is indicated, decide whether to fund it or, alternatively, to abandon the program model; (vi) proceed to an RIE efficacy trial; (vii) if the program passes the efficacy trial, proceed to replication; (viii) if the program does not pass the efficacy trial, (perhaps) repeat the formative evaluation and process evaluation steps (ii–v); (ix) proceed to an RIE effectiveness trial; and (x) if the program passes the effectiveness trial, proceed to broad rollout.

This article's advocacy of formative evaluation and process evaluation should not be used as an excuse to delay RIE. We would urge a bias toward either proceeding to the process evaluation and then RIE, or terminating the program. Partly to encourage programs to move on to RIE when appropriate, another round of formative evaluation and process evaluation should be far from automatic. Forcing programs that fail their own logic models to reapply for funding for formative evaluation—alongside other programs that have not had even one round of formative evaluation—is one possible strategy.

We conclude by acknowledging that the approach described here is unlikely to be adopted completely. Our approach would convert a process that already often takes nearly a decade into a process that will often take considerably more than a decade. We have argued that our proposed approach is consistent with good science and with the reality that most programs subjected to RIE do not work. Nevertheless, further lengthening of evaluation cycles is, for at least two reasons, potentially problematic. First, our proposed approach is inconsistent with the political cycle. Policy makers and politicians face strong pressures to be seen as "doing something"—and soon! Given that pressure, evaluation resources for any given program area may not be available for long, and the approach proposed here is therefore infeasible.
Second, it is sometimes plausible to argue that the effectiveness of the program is itself changing. Society is evolving quickly; an approach that worked 10 years ago may no longer work 10 years from now.

Even within our framework, several strategies exist to shrink any increase in time lines. First, conduct the process analyses relatively quickly and expect prompt reporting of results. Second, issue one conditional contract (see the earlier discussion about how to minimize the bias of doing so). Third, lean against repeated cycles of formative evaluation and process evaluation. We acknowledge that the more salient either of these two concerns (the political cycle and shifting program effectiveness), the less attractive the lengthening of evaluation time lines implied by our approach will be. Those crafting research strategies will need to weigh our proposed approach's longer time lines against its possible advantages: finding more programs that work by helping allocate scarce evaluation resources only toward those programs that are truly ready for RIE.

Acknowledgments

The authors thank the numerous people who suggested the examples. This paper represents only the position of the authors, not of those who suggested examples, or of Abt Associates, the American Institutes for Research, or their research clients.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Jacob Alex Klerman's time on this paper was funded with internal Abt Associates research funds.

Notes

1. See http://www.whitehouse.gov/administration/eop/sicp/initiatives/innovationfunds. In fact, the ideas in this article were partially stimulated by efforts related to those programs and their discussions about "evidence standards" (see, e.g., Shah and Jolin 2012).
2. This division into "efficacy evaluation" and "impact evaluation" is the pure case (e.g., Society for Prevention Research 2004). It is useful in that initial implementations and impact evaluations often occur under the most favorable conditions and with the explicit intent of demonstrating that the program can work. Thus, such initial implementations often benefit from higher levels of funding than would be ...

Explanation & Answer



Response to Reading


When Is a Program Ready for Rigorous Impact Evaluation? The Role of a Falsifiable Logic Model
The article examines the effectiveness and reliability of rigorous impact evaluation (RIE) of programs. It argues that moving to RIE too quickly is one factor behind RIE's limited effectiveness and reliability, and that program evaluation should first compare the treatment group's intermediate outcomes against the program's own benchmarks while the program is operating. An evaluation of the examples presen...

