DEVELOPMENT AND PSYCHOMETRIC EVALUATION OF A CLINICAL BELIEFS QUESTIONNAIRE FOR LICENSED PSYCHOLOGISTS

In recent decades, a specific class of dubious clinical practices has been labeled pseudoscientific and highlighted as a growing area of concern in psychology. Experts have identified numerous examples of pseudoscientific treatments, which are troubling for various ethical reasons. In light of the absence of research investigating the nature of professional beliefs and knowledge associated with scientifically substantiated and unsubstantiated clinical interventions, the primary objective of this study was to develop a questionnaire (viz., the Clinical Attitudes and Knowledge Questionnaire, or CAKQ) to appraise specific clinical knowledge domains and attitudes toward science among licensed, doctoral-level practitioners of clinical psychology. This aim was pursued through generating items designed to detect the presence of knowledge pertaining to (a) legitimate and questionable treatment techniques used in contemporary clinical practice; (b) general clinical psychology research (e.g., controversies relevant to applied practice); and (c) clinical judgment and decision-making procedures. A preliminary scale consisting of items addressing practitioner attitudes toward science in clinical psychology was also created. A secondary study aim was to ascertain whether psychologists’ professed knowledge varied in relation to years involved in clinical practice. Two thousand randomly selected licensed psychologists in New England engaged in clinical practice were invited to participate in the study, and the final sample size was 324 participants. Statistical analyses indicated that the initial hypotheses were partially supported. The hypothesis that a four-component solution would best summarize the CAKQ data was not supported by principal components analysis results. However, the hypothesized relationship between clinical knowledge and critical thinking skills was partially supported.
Consistent with expectations, a lower reliance on intuitive thinking styles was associated with greater clinical knowledge. Finally, the hypothesis that total number of years of clinical experience would not predict higher clinical knowledge scores was also upheld. Study limitations and future research directions were discussed.

eloquently argued that although having one's heart in the right place, or "softheartedness," was essential to guiding humane approaches to economic policies, a reticence to steer the development and implementation of such policies with well-informed critical thinking, or "hardheadedness," may result in undesirable or even disastrous consequences.
Likewise, in clinical psychology, effective practitioners should be capable of not only exhibiting warmth and empathy during therapeutic interactions with clients, but also be willing to keep strictly intuitive and emotionally-driven preferences at bay while scrutinizing the evidentiary value of assessments and treatments to be considered and implemented. Unfortunately, "hardheaded" critical thinking appears to be underutilized (and possibly underappreciated) among a considerable percentage of practitioners within the field (see Gaudiano, Brown, & Miller, 2011;Sharp, Herbert, & Redding, 2008), which is concerning given the potentially baneful ramifications for vulnerable mental health clients.
A specific class of questionable clinical practices has been labeled pseudoscientific in recent decades and highlighted as a special area of focus and concern in psychology (Pignotti & Thyer, 2009; Still & Dryden, 2004). Despite difficulties formally defining pseudoscience, which is largely attributable to the complexities of the enduring demarcation problem in philosophy of science (Derksen, 1993; Lakatos, 1974; Laudan, 1983; Pigliucci & Boudry, 2013; Resnik, 2000), there is some consensus on specific hallmarks or warning signs distinguishing pseudoscience from science. These include: (a) a disproportionate focus on hypothesis confirmation at the expense of adequate testing and refutation; (b) attempts to shield the fundamental principles (or hard core) of a research program from falsification through an ongoing invocation of auxiliary hypotheses, which Lakatos (1970) termed a degenerative research program when taken to extremes; (c) evasion of the peer review process; (d) the use of obscurantist jargon to create a superficial veneer of scientific legitimacy; (e) a consistent lack of self-correction and subsequent stagnation of ideas; (f) shifting the burden of proof from claimants to skeptics (e.g., declaring that the onus lies squarely on the critics of an approach to adduce evidence against it); (g) an absence of specified conditions under which claims do not hold (i.e., the delineation of boundary conditions); (h) lack of connectivity with related areas of scientific knowledge; and (i) an overemphasis on personal anecdotes and testimonials to lend credence to claims (Bunge, 1984; Lilienfeld, 2005, September; Lilienfeld et al., 2003; see also Ruscio, 2005). Thus, pseudoscience can be conceptualized as "nonscience masquerading as genuine science" (Lilienfeld, 2010, p. 286) and often contains expansive or extraordinary claims that lack essential supportive evidence, contradict well-established scientific findings, and/or overstep the boundaries of current scientific knowledge.
Regardless of how we choose to categorize and label ineffective and/or potentially harmful clinical methods, which has been hotly debated in the psychological literature (see McNally, 2003, and multiple spirited responses in the same issue), there appears to be overall agreement among scholars that such methods are concerning for various ethical reasons (e.g., the violation of general principles and specific standards of the American Psychological Association (APA) Ethics Code [2002], such as therapists taking care to do no harm and allowing research evidence to guide their practices). From another perspective, the dissemination of ineffective clinical methodologies can be viewed as akin to a negative externality in economic terms. That is, there are clear social, emotional, psychological, and financial costs when ineffective (or possibly harmful) interventions are provided to mental health consumers, but clinicians who regularly use them do not necessarily "pay" for these costs. Rather, clinicians may reap financial profits from offering suboptimal services in place of effective ones, which ultimately subtract from (or, in some cases, endanger) client welfare. Furthermore, certain clinicians may never be held accountable for damages inflicted upon their clients unless formal ethics complaints and/or legal charges are filed.
Alongside ethical violations, the proliferation and perceived acceptance of pseudoscience in clinical psychology arguably contributes to unflattering images of the field as partially evidenced by a raft of controversial articles published in 2009 in high-quality scientific journals (e.g., Nature; "Psychology: A reality check," 2009), well-regarded psychological journals (e.g., Psychological Science in the Public Interest; Mischel, 2009), and popular news magazines (e.g., Newsweek; Begley, 2009, October), many of which openly questioned the legitimacy of applied clinical practice.
In the wake of these accusations, the APA responded not by encouraging further research on the veracity of such claims (e.g., by surveying practitioners on preferred practices) or investigating negative perceptions of the field, but rather by calling for strategies designed to persuade skeptical scholars and the general public to view the field as worthy of science, technology, engineering, and mathematics (STEM) status (APA, 2010, June). The APA's response arguably ignores and obscures a largely furthermore, such questionnaires understandably lack items tailored to clinical psychological practices (e.g., McLean & Miller, 2010;Morier & Keeports, 1994;Vyse, 1997;Wesp & Montgomery, 1998). In fact, only three studies to date (viz., Gaudiano et al., 2011;Sharp et al., 2008) have directly examined clinicians' perspectives on evidence-based and non-evidence-based practices, and these investigations were relatively narrow in scope (e.g., reported theoretical orientations and circumscribed areas of suboptimal practices, such as TFT).
An exhaustive review of clinical practices that arguably rest on pseudoscience exceeds the scope of the present work. Many such reviews already exist in peer-reviewed journals and book chapters (see previous citations, especially Lilienfeld et al., 2008) and reflect the thoughtful scrutiny of recognized experts in this area of inquiry. Instead, this dissertation attempted to address a current gap in the literature by using survey methodology to examine what practicing clinical psychologists report to know in the context of clinical practice. Collected data were also analyzed for potential associations with pertinent cognitive/information processing variables. The current study thus sought to redress the abovementioned shortcomings in the literature through a preliminary investigation of clinical knowledge profiles among licensed psychologists (e.g., knowledge of the updated treatment and clinical decision-making literature).

Clinical Science and Evidence-Based Interventions
"What is this thing called science?" 1 Numerous pithy characterizations of science abound in both popular culture and within the profession of clinical psychology. Renowned physicist Richard Feynman (1985), for example, described scientific inquiry as a bending over backward to refute one's own hypotheses. In his oft-quoted Caltech commencement address, in which he discussed the critical missing elements of "Cargo Cult Science," he stated, "It's a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty-a kind of leaning over backwards" (Feynman, 1974, p. 11). Astronomer and astrophysicist Carl Sagan is commonly known for attributing the essence of science to rigorous critical thinking as opposed to a body of knowledge (Sagan & Druyan, 1997). The father of the scientist-practitioner model in clinical psychology, David Shakow (1976), echoed similar sentiments in championing critical thinking over blind faith in particular theories or wishful thinking about desired research outcomes.
When describing the intellectual habits of his academic mentors at University of Minnesota, Paul Meehl (1993) acknowledged philosopher Bertrand Russell's "dominant passion of the true scientist-the passion not to be fooled and not to fool anybody else" (p. 728). He further noted that if clinical psychologists forego this passion and forget "the two searching questions of positivism: 'What do you mean? How do you know?'", they become "little more than be-doctored, well-paid soothsayers" (Meehl, 1993, p. 728). Some psychologists emphasize intellectual honesty combined with a strong resistance to allowing political correctness to influence knowledge, especially out of a desire to feel more "comfortable" about reality (e.g., Hunt, 1999), while others (e.g., Beyerstein, 1997; Lilienfeld, 2010, 2012) maintain that the scientific common ground among allied psychological disciplines and the "hard sciences" (e.g., physics and chemistry) includes protection against confirmation bias combined with a rigorous and conscientious ferreting out of erroneous beliefs and preconceptions.
Despite the aptness and elegance with which luminaries from different fields of study have articulated certain core components of science, it would be impossible to summarize the multifaceted nature of science with a single sentence. Indeed, entire books have been devoted to capturing the essence of science (e.g., Carey, 2011;Chalmers, 1999;Gauch, 2003;Schurz, 2014). Virtually all such books agree that furthering our understanding of nature is made possible by a foundational characteristic of science that distinguishes it from other methods of truth seeking-the scientific method. At its most basic level, the scientific method entails observation, testing, and explanation (Carey, 2011). Summarized in more detail (albeit not exhaustively), this process may proceed as follows (cf. Box, Hunter, & Hunter, 1978;Gauch, 2003): Seeking to discover the truth about a phenomenon of interest, a scientist first obtains existing information about the phenomenon (e.g., previous research and observations). This information assists with shaping initial educated guesses about the phenomenon, which ultimately take on tightly defined and testable hypotheses that subsequently steer the design and methodological features of the study. Pre-established procedures guide the systematic collection of data, which contain truth about reality intermingled with noise (i.e., various kinds of error), the latter being diminished as much as possible via sound research design and conscientious scientific conduct. Next, data bearing on the hypotheses are summarized and appropriately analyzed, and conclusions and inferences are carefully formulated with an eye toward observed results. Finally, future research directions, study limitations, new hypotheses, and/or possible theoretical adjustments are submitted for consideration, and the entire study is summarized in written form to be peer reviewed by the scientific community. 
If the study stands up to the critical scrutiny of other independent scientists and is replicated by additional investigators, the findings are added to the provisional corpus of scientific knowledge until further revised or later disconfirmed.
An armamentarium of cognitive approaches and logical tools is put to use as the scientific method is applied. This encompasses general problem solving skills, critical thinking, skepticism, inductive logic, deductive logic (or, perhaps more accurately in most practical contexts, abductive reasoning; Fann, 1970), eliminative parsimony, probabilistic thinking, falsification (i.e., attempts at hypothesis disconfirmation), relational reasoning, causal inference, and analogical reasoning (Dunbar & Fugelsang, 2005;Gauch, 2003). Epistemologically, these modes of thinking are typically accompanied by a form of Peircean fallibilism (Peirce, 2011) in contemporary scientific inquiry, which acknowledges the influences of error and uncertainty in rendering knowledge provisional. In line with fallibilist and probabilistic modes of thinking, inferences drawn in scientific psychology are tainted by layers of uncertainty and are thus subject to the dominion of probability theory (unlike pure mathematics and formal logic). Because determinism in psychology is untenable, psychological theories cannot be sufficiently proved in the strong Euclidian sense (Meehl & MacCorquodale, 1991). Rather, psychologists are tethered to more cautious and tentative articulations of research findings. They can declare, for example, that theories are rendered "more likely" or "more credible" based upon the preponderance and comparative weighing of both confirmatory and disconfirming evidence, but declarations of formal proof are erroneous.
Thus, a central objective of scientific research is to arrive at the best possible approximations of various aspects of reality (e.g., a correspondence theory of reality or semi-hemi-demi scientific realism; O'Connor, 1975;Irwin, 1988;Russell, 1912) using sound reasoning principles and methodological tools. This epistemological clarification is especially important given that the pursuit of "absolute certainty" is futile "concerning questions of fact" (Peirce, 2011, p. 59). However, scientists do their utmost to allow features of reality (versus strictly socio-cultural constructions of reality) to dictate the content of truth propositions (Boghossian, 2007). In general terms, nature should ultimately control what scientists take to be "correct" answers about it (Irwin, 1988). Thus, truth is conceptualized as an accurate representation of the state of nature insofar as the scientific method can elucidate, although the presence of error and tentativeness of conclusions about worldly phenomena should always be borne in mind.
All of the aforementioned perspectives are compatible with scientific skepticism, which is rooted in the acknowledgement of a complex world that is unlikely to be fully understood by bounded human understanding (Devilly & Lohr, 2008, p. 107). They are also congruent with Donald Campbell's (1974) concept of "evolutionary epistemology" in science, or the ferreting out of erroneous claims over time via the scientific method in a manner akin to natural selection. Ideally, psychological researchers consistently and conscientiously adhere to disciplined cognitive habits (i.e., following the scientific method and utilizing relevant critical thinking skills) when conducting treatment outcome research, and clinical practitioners ideally draw from the same cognitive toolkit when making important decisions about the care of mental health clients. This perspective of scientifically informed researchers and practitioners is consistent with David Shakow's original formulation of the scientist-practitioner model in clinical psychology (Baker & Benjamin, 2000;Cautin, 2008;Shakow, 1969).
The scientist-practitioner model. The advent of the scientist-practitioner (or "Boulder") training model occurred during the Boulder Conference on Graduate Education in Clinical Psychology in 1949 (Raimy, 1950). Conference participants agreed that research-informed practice was critical to the maturation and survival of clinical psychology, and the doctor of philosophy (Ph.D.) degree was decreed the legitimate basis for professional licensure (Baker & Benjamin, 2000). From the perspective of Boulder model proponents, practical clinical training must go hand-in-hand with rigorous research training given that these professional activities mutually inform one another in important ways. The ideal scientist-practitioner is thus someone who has the skills and knowledge base not only to make informed and responsible decisions about clinical care (i.e., by being able to understand, evaluate, and selectively apply research findings), but also to produce and publish novel research to advance the field (Jones & Mehr, 2007). As Belar and Perry (1992) (Donn, Routh, & Lunt, 2000; Mitchell, 1977). Relative to Ph.D. programs, Psy.D. programs are more likely to: (a) focus more heavily on applied clinical work and practice-oriented course material, (b) deemphasize data-driven research endeavors (although incorporating research into practice ideally remains an emphasis), and (c) not require a formal dissertation involving data collection and quantitative data analysis, although many such programs do require scholarly projects and non-data-driven dissertations (e.g., theoretical papers) (Donn et al., 2000; McIlvried et al., 2010). In addition, Psy.D. programs, many of which are housed in non-university, for-profit schools, typically accept more students 2 , offer less financial aid, and place their graduates in more strictly applied clinical settings (vs. academic or research settings) (McIlvried et al., 2010).
These observations are presented here to illustrate key differences between the two dominant training models in contemporary clinical psychology and not to argue for the superiority of one model over another per se, although it may be fair to hypothesize that the typical student may be more likely to encounter stronger scientific training in a scientist-practitioner or clinical science Ph.D. program compared to the average Psy.D. program. Indeed, as Baker and colleagues (2008, p. 85) suggested, a large number of Ph.D. programs likely only pay "lip service" to scientific foundations, and a small number of Psy.D. programs do offer rigorous science-based curricula. At any rate, differences often highlighted between prototypical scientist-practitioner Ph.D. and practitioner-based Psy.D. programs (i.e., predominantly research- vs. practice-based) unavoidably touch upon a broader concern in contemporary clinical psychology: the rift between science and practice.
The term "scientist-practitioner gap" (Cautin, 2011;Tavris, 2003) is often used to refer to the schism between sound research and effective practice in clinical psychology, although the terminology obviously focuses on the divide between the professionals themselves. Other researchers refer to this same problem more generally as the "science-practice gap" (Lilienfeld, 2013). From one angle, this gap may be viewed as the consequence of researcher-practitioner alienation or isolation (e.g., lack of overlap in professional contact and activity), which may inadvertently contribute to poor clinical care (Teachman et al., 2012). overly objective approach to a subjective enterprise that attempts to micro-manage their professional activities (Pinsof, Goldsmith, & Latta, 2012, p. 253 Researchers are expected to engage in disciplined and systematic observation, hypothesis development and testing, data collection and analysis, problem solving, critical thinking, and the formulation of results and conclusions based on the available data (Devilly & Lohr, 2008, pp. 106-108 (e.g., scientifically-informed assessment and treatment selection and implementation) (Baker, McFall, & Shoham, 2008). However, it is presently unclear what has come of these efforts.
McFall (1991), along with other dedicated psychological scientists (e.g., Klerman, 1990), strongly advocated for the use of randomized controlled trials (RCTs) in clinical research, which are one of the cornerstones of clinical science. RCTs require random assignment of participants to treatment conditions and tightly controlled procedures to ensure (as best as possible) that participants across groups are treated the same with the exception of the intervention component(s) unique to their specific condition (Bolton, 2008). Stated another way, the crux of controlled clinical trials is to isolate and identify treatment components that exert beneficial effects (e.g., symptom reduction and/or improvements in life functioning) surpassing the effects (or lack thereof) observed in comparison conditions, which may involve no-treatment conditions, waiting-list controls (WLCs), psychological placebos 3 , or standard practice treatments already deemed effective, which typically comprise the treatment as usual (TAU) category (Bolton, 2008; Kazdin, 2003).
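The random assignment step described above can be illustrated with a minimal sketch. The participant identifiers, condition labels, and the `randomize` helper are hypothetical and chosen only for illustration; actual trials typically rely on more elaborate schemes (e.g., blocked or stratified randomization with concealed allocation), which this sketch does not attempt to reproduce.

```python
import random
from collections import Counter


def randomize(participant_ids, conditions, seed=None):
    """Randomly assign participants to conditions in near-equal numbers.

    Hypothetical helper: shuffles the participants, then deals them
    round-robin across conditions so group sizes differ by at most one.
    """
    rng = random.Random(seed)  # a seeded generator makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    return {
        pid: conditions[i % len(conditions)]
        for i, pid in enumerate(shuffled)
    }


# Illustrative values only: 60 participants, three comparison conditions.
participants = [f"P{n:03d}" for n in range(1, 61)]
conditions = ["treatment", "placebo", "waiting_list"]

assignment = randomize(participants, conditions, seed=42)
group_sizes = Counter(assignment.values())
print(group_sizes)
```

Because chance alone determines who lands in which condition, pre-existing client differences are expected to balance out across groups on average, which is what allows group differences at outcome to be attributed to the intervention.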
Whereas participants in WLC conditions temporarily receive no treatment at all (i.e., until the conclusion of data collection), those in placebo-control conditions still receive an intervention, albeit subtracting out the critical treatment ingredients hypothesized to confer additional psychological benefits (i.e., over and above no treatment, non-specific therapeutic factors, or TAUs). Psychological placebos assumedly contain all of the generally beneficial yet incidental factors (e.g., treatment credibility, expectancy effects, therapeutic alliance influences, etc.) divorced from specific therapeutic elements (e.g., interoceptive exposure for panic symptoms) hypothesized to confer unique improvements (Herbert & Gaudiano, 2005). Thus, placebo conditions have at least some surface credibility vis-à-vis active treatments (see O'Connor et al., 2007, p. 185, for a clear description of an attention placebo control procedure compared to a CBT condition for treating delusional symptoms).
Of course, psychological placebos, which are vastly different from the inert sugar pills used in pharmacological research (e.g., sugar pills obviously do not engage in human relationships), are not without their share of limitations. These include a lack of perceived treatment credibility to client participants (e.g., transparency of assignment to placebo-control groups), poor congruence of structural features of psychological placebo-control groups with target treatments (Baskin, Tierney, Minami, & Wampold, 2003), and challenges controlling for and matching expectancy effects between control and treatment conditions (Boot, Simons, Stothart, & Stutts, 2013). In addition, a substantial portion of what has been traditionally viewed as noise or nuisance variance associated with psychological placebos actually consists of legitimate but understudied non-specific factors inherent in all forms of mainstream psychotherapy (Norcross, 2011;Omer & London, 1989;Wampold, 2001;Wampold et al., 2010). Devilly and Lohr (2008) described three key categories of non-specific effects, namely: (a) common factors, or recognized ubiquitous features of most treatments imparting therapeutic benefit (e.g., therapist attention, persuasion, expectancy effects, and so forth); (b) unspecified but active factors, where "unspecified" refers to palliative features of treatments not explicitly pinpointed as active ingredients (e.g., interpersonal influences embedded in therapeutic procedures and sociocultural contexts); and (c) factors without specific activity, or factors inherent in many treatments that more diffusely allay symptoms through ambiguous (non-specific) mechanisms of action (e.g., in the field of medicine, the example of aspirin being used to address disparate physical ailments). Obviously, there is much conceptual overlap among and ambiguity within these broad categories. 
Furthermore, it is difficult to find methodologically sound studies focusing on concrete, isolatable examples of these factors in contemporary psychotherapy research, although one exception would be the work of Bruce Wampold (2001), who posits a contextual model of psychotherapy (i.e., attempting to clearly delineate and systematically study broad classes of non-specific therapeutic actions). And finally, conceptual problems emerge when one recognizes the arbitrariness of distinguishing specific from non-specific factors, which may be dependent on the clinician's theoretical orientation (e.g., accurate empathy would be considered a specific factor in humanistic psychotherapies but a non-specific factor in CBT) (Herbert & Gaudiano, 2005, p. 897).
Although the effects of non-specific factors can be powerful (see Wampold, 2001) and certainly should not be dismissed, these effects alone are often insufficient for establishing treatment efficacy and differential effectiveness among different forms of psychotherapy (e.g., which treatments work for which conditions, and under what circumstances). Component-controlled efficacy studies assist in clarifying which specific ingredients of a particular novel treatment work with ceteris paribus applied to non-specific (e.g., interpersonal and contextual) factors as best as can be achieved in this complex domain of research (Devilly & Lohr, 2008). Of note, the importance of examining critical mediators and moderators of therapeutic change likewise should not be underestimated in this context, nor should the value of longitudinal designs for addressing treatment outcome stability and process variables (e.g., specific mechanisms of change 4 over time) lurking both within and beyond the pre- and posttreatment interval (Laurenceau, Hayes, & Feldman, 2007). Despite their limitations, RCTs comprise a rigorous scientific research design that far exceeds many other clinical decision-making methodologies (e.g., subjective clinical impressions) and remains at the heart of efficacy studies, which will be described next in more detail.
Efficacy and effectiveness research. Efficacy studies focus primarily on symptom reduction and usually are conducted in research clinics staffed by specially trained clinicians and research assistants (Baker et al., 2008;Kazdin, 2003).
Methodological approaches that typically define efficacy research include random assignment, double-blind procedures, placebo or other appropriate treatment comparison groups, standardized assessment (e.g., structured or semi-structured diagnostic interviews), and adherence to pre-specified inclusion and exclusion criteria for research participants (Baker et al., 2008;Chambless & Ollendick, 2001;Campbell & Stanley, 1963). Properly applied, these procedures work together to strengthen the internal validity of efficacy studies, which entails ruling out a complex host of rival alternative explanations (Campbell & Stanley, 1963). In the context of treatment outcome research, specific threats to internal validity include (but are not limited to): (a) the placebo effect (e.g., the combined effects of expectancy, compliance, suggestion, distraction from symptoms, etc.); (b) the self-limiting nature of many psychopathological conditions (i.e., gradual natural recovery irrespective of treatment); (c) spontaneous remission; (d) cyclical symptom variation over time; and (e) the impact of self-serving cognitive biases on self-report data (Beyerstein, 1997).
As Beyerstein (1997) noted, clinical scientists have a professional and ethical responsibility to establish that treatments are safe and effective, the latter almost always being the most challenging in research due to the necessity of ruling out these rival hypotheses. Failure to address and control for these threats via random assignment, the use of placebo-controlled groups, and double blinding often results in an unwarranted attribution of observed recovery (e.g., symptom alleviation or remission) to dubious interventions (see "The Twilight Zone of EMDR" section of this dissertation for examples of published studies lacking these critical safeguards).
However, if these controls are properly implemented, and the magnitude of outcome improvement in the treatment group is meaningfully larger 5 than observed improvement in the placebo (or no-treatment, WLC, or TAU) group, this suggests treatment efficacy, especially when replicated (Beyerstein, 1997).
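One common way to express how much larger the improvement is in one group than in another is a standardized mean difference. The sketch below computes Cohen's d with a pooled standard deviation. The choice of Cohen's d is an assumption made here for illustration and is not necessarily the index of "meaningfully larger" intended in this dissertation; the improvement scores are invented and do not come from any study.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(treatment, control):
    """Standardized mean difference between two independent groups.

    Uses the pooled sample standard deviation as the denominator.
    """
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled_sd


# Invented pre-to-post symptom improvement scores (higher = more improvement).
treatment_improvement = [12, 9, 14, 10, 11, 13, 8, 12]
placebo_improvement = [6, 7, 5, 9, 4, 8, 6, 7]

d = cohens_d(treatment_improvement, placebo_improvement)
print(f"Cohen's d = {d:.2f}")
```

A positive value indicates greater improvement in the treatment group than in the comparison group. Whether a given value is clinically meaningful remains a substantive judgment that the statistic alone cannot settle, and replication is still required before efficacy is inferred.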
In contrast to efficacy studies, effectiveness studies are quasi-experimental and are designed to boost external validity (also termed generalizability or ecological representativeness), or the degree to which research findings may be soundly extrapolated to real-world clinical scenarios (although additional cross-validation procedures are critical for establishing external validity; e.g., see Hoeppner et al., 2012). The core question here is the degree to which the intervention works in other applied clinical settings outside of the boundaries of the study. This determination is pursued by relaxing stricter degrees of experimental control (e.g., random assignment) in order to conduct treatments in more naturalistic, representative settings where the usual clinical staff members deliver their typical target interventions to client participants, who are not subjected to as stringent inclusion and exclusion criteria.
Some scholars perceive a tension between efficacy and effectiveness research (i.e., rigor and relevance, respectively) and note that the more tightly controlled a study, the less results will generalize (see Gelso, 1985). In contrast, Baker and colleagues (2008) frame effectiveness research as bridging the gap between research and non-research clinics, and despite some differences in research outcomes comparing efficacy and effectiveness studies, effect size estimates are mostly concordant (Lambert, 2013; Nathan & Gorman, 2007). Ideally, clinical interventions that are consistently well supported by replicated efficacy and effectiveness results should be broadly disseminated across mental health consumer populations with the objective of serving as many clients in need as possible (McHugh & Barlow, 2010).

Empirically supported treatments.
It is a well-replicated finding that, on average, mental health clients receiving psychotherapy evidence more symptom remission and functional gains than those either receiving a psychological placebo-control intervention or not undergoing therapy (see Lambert, 2013 originally called empirically validated treatments or EVTs; Chambless et al., 1996).
Lists of ESTs with accompanying definitions of categories of evidentiary support were subsequently published (Chambless et al., 1996; see also Chambless & Ollendick, 2001).
According to Division 12 Task Force criteria, treatments deemed well-established must be supported by "at least two good between-group design experiments" with evidence of efficacy via either (a) demonstrated beneficial effects over and above placebo or other treatment, or (b) equivalence to a previously established treatment (e.g., TAU) with "adequate sample sizes" (p. 4). An alternative way for treatments to achieve this status is by demonstrating efficacy in "a large series of single-case design experiments" (conducted by at least two different research teams) involving "good experimental design," treatment comparison conditions, treatment manuals (or, alternatively, explicitly defined intervention components and steps), and specified sample characteristics (Chambless et al., 1998, p. 4). One step below well-established treatments are probably efficacious treatments, which may meet any one of the following three criteria: (a) demonstrated superiority to WLC in at least two separately conducted experiments, (b) meeting all necessary well-established treatment criteria with the exception of independent replication by at least two different research teams, or (c) meeting well-established treatment criteria using a "small series of single-case design experiments" (Chambless et al., 1998, p. 4).
[...] (Chen et al., 2014; Swartz et al., 2005), and cost-effective preventative treatment that consistently improves and maintains smoking cessation outcomes across diverse client populations (Hollis et al., 2000; Maciosek et al., 2006; Sheffer et al., 2009) despite its apparent underutilization (Shiffman, Brockwell, Pillitteri, & Gitchell, 2008).
Similarly favorable long-term research results support the use of CBT strategies (especially behavioral interventions) for major depressive disorder (Honyashiki et al., 2014), panic disorder with or without agoraphobia (Otto & Deveney, 2005), bulimia nervosa (Cooper & Shafran, 2008), posttraumatic stress disorder (Foa, Gillihan, & Bryant, 2013), generalized anxiety disorder (Bolognesi, Baldwin, & Ruini, 2014), obsessive-compulsive disorder (Lewin, Wu, McGuire, & Storch, 2014), hypochondriasis (Olatunji et al., 2014), reduced risk of psychosis (Hutton & Taylor, 2014), and various other psychological conditions (see Chambless & Ollendick, 2001, for a comprehensive list of ESTs for specific conditions and Lambert, 2013, for further details). In addition, aside from CBT, interpersonal psychotherapy is also considered a well-established treatment for depression, as is behavioral family therapy for schizophrenia, behavioral marital therapy for marital discord, and brief psychodynamic therapy for geriatric depression (Chambless & Ollendick, 2001).
Many vociferous concerns about the concept of ESTs and their use in clinical practice have been raised (e.g., Westen, Novotny, & Thompson-Brenner, 2004). Stewart and colleagues (2012) summarized and offered rejoinders to some of the most commonly encountered (and often fallacious) objections, including supposed lack of generalizability to real-world clinical practice, appeals to clinical intuition and expertise as superior or equal to research findings, the specious yet aggressively pervasive belief in the Dodo Bird verdict (i.e., the unfounded claim of universal treatment equivalence due to the therapeutic alliance, hope, empathy, etc.; see also Hofmann & Lohr, 2010, January), and the charge that ESTs are "unfair" because they overwhelmingly favor CBT or strictly behavioral interventions over other forms of psychotherapy. More thoughtful and discussion-worthy objections include the observation that placebo-control conditions and WLCs set the evidentiary bar too low (Herbert & Gaudiano, 2005), especially in light of the Edinburgh Revision to the 1964 Declaration of Helsinki (World Medical Association, 2013); and the lack of research-based guidelines to steer idiographic tailoring of ESTs (i.e., scientifically justified specifications for adapting specific approaches to specific conditions for certain kinds of clients) (Holt & Beutler, 2014). The present author agrees with the position that if provisional EST lists indeed turn out to boost the ratio of effective, scientifically supported treatments to largely ineffective, scientifically counterfeit approaches, then these lists are worth compiling and implementing despite their unavoidable shortcomings (Chambless & Ollendick, 2001; Lilienfeld, 2010).
However, the objection to the unwieldy impracticality of tailoring many different manualized treatment protocols to various DSM disorder categories is well taken (Wachtel, 2010) and may justify studying principle-focused approaches instead (e.g., the Unified Protocol for Transdiagnostic Treatment of Emotional Disorders, which has received some preliminary research support; Bullis, Fortune, Farchione, & Barlow, 2014).
As an important aside, this discussion of ESTs is unavoidably intertwined with the contemporary healthcare climate in the United States. Specifically, as healthcare delivery systems, expenditures, and economic decision making (esp. of mental health care stakeholders) have continued to change over the past three decades, a growing prioritization of cost-effective mental health interventions has emerged (Baker et al., 2008). Treatments consistently shown to alleviate psychological distress (using assessment tools that quantify symptom severity) in a relatively brief time period over and above the effects of rival interventions will more likely survive the ever-increasing pressures of managed care (Hayes, Barlow, & Nelson-Gray, 1999). That is, parsimonious interventions (e.g., cognitive-behavioral therapies) supported by scientific research will likely fare better regarding incorporation into the healthcare delivery system and receive insurance reimbursement (Kauth, Sullivan, Cully, & Blevins, 2011). Thus, in the face of dramatically rising costs and service demands in mental and behavioral health care (e.g., see Poisal et al., 2007), it remains incumbent upon doctoral-level practitioners of clinical psychology to offer interventions that are effective, efficacious, cost-effective, and scientifically sound (Baker et al., 2008; Beecham et al., 1997) and to monitor real-life clinical outcomes for accountability and quality improvement purposes (e.g., Hodges & Wotring, 2004). These key quality-of-care criteria are already guiding healthcare coverage decisions as insurance companies and governmental agencies continue to oversee increasingly large swaths of funding in these areas, and this trend is expected to continue well into the future.
Although limited in number, later research studies yielded similarly sobering results (e.g., Horan & Blanchard, 2001, winter; Hays et al., 2002).
For example, an exploratory survey of 133 APA-accredited clinical internships found that a mere 28% of the sites provided more than 15 hours of supervision and training in ESTs (e.g., CT, CBT, DBT, and IPT), and furthermore, that 30% of the sites dedicated either minimal (19%) or no (11%) time for EST training opportunities (Hays et al., 2002). Finally, a survey of 172 graduate students drawn from 60 APA-accredited clinical, counseling, and school doctoral psychology programs revealed that close to two-thirds of respondents had never read a single evidence-based treatment related publication (e.g., Task Force articles, treatment manuals, etc.), and approximately 32% had never taken a course covering EST content or research (Karekla, Lundgren, & Forsyth, 2004).
Missing from these surveys is information regarding what exactly many graduate students are spending their time learning during their pre-doctoral-training years in the place of scientifically supported interventions. As Lilienfeld (2010) conjectured, much of the time not dedicated to learning effective treatments may instead be spent learning about non-specific factors (e.g., warmth, empathy, listening, etc.) and/or less-than-optimal intervention techniques. Among the latter lie pseudoscientific treatments, which contribute to the professional marginalization of the field of clinical psychology.

The Contours of Pseudoscience
What is pseudoscience? In the contemporary climate of second-generation managed care and professional accountability, it is incumbent upon mental health professionals to choose their interventions with deference to best evidence (Hayes et al., 1999). To do otherwise (e.g., to choose interventions on the basis of novelty or emotional appeal to the therapist or patient) potentially carries harmful consequences for clients and professionals alike. Although there are many widely available evidence-based treatments with sound theoretical underpinnings alongside responsible practitioners who adhere to them, pseudoscientific theories and treatments remain pervasive in the field of clinical psychology (see Lilienfeld et al., 2003).
Far from being a novel phenomenon in the current professional landscape (see Gardner, 1957), pseudoscience has been repeatedly recognized over time as a major threat to both public welfare and the scientific foundation and integrity of the field (Lilienfeld, 1998, fall). The proliferation of pseudoscience may be partly attributed to anti-science sentiments and the acceleration of commercial marketing of interventions (Olatunji, Parker, & Lohr, 2005-2006, fall/winter). Some psychologists (e.g., past APA president Ronald Fox) apparently deem clinical outcome research superfluous as evidenced by public pronouncements of the following ilk: "Psychologists do not have to apologize for their treatments. Nor is there an actual need to prove [sic] their effectiveness" (Fox, 2000, pp. 1-2). Other factors likely contributing to the proliferation of pseudoscience in psychology include human credulity combined with poor critical thinking skills, which may persist despite education level or intelligence (Shermer, 2002), vulnerability in the face of refractory mental health conditions, self-serving biases (including cognitive dissonance), and false hope, all of which may hasten the suspension of reason (Beyerstein, 1997; Worrall, 1990).
Formal definitions of pseudoscience have proven difficult. As social psychologist Carol Tavris once remarked at an American Psychological Society symposium, "Pseudoscience is like pornography; we can't define it, but we know it when we see it" (as cited in McNally, 2003, p. 97). However, there is substantial agreement on an interrelated set of general hallmarks or warning flags of pseudoscience (Beyerstein, 1997; Bunge, 1984, fall; Derksen, 1993; Hines, 2003; Lilienfeld et al., 2003; Pratkanis, 1995; Ruscio, 2005; Stanovich, 2003), most of which were outlined in the Introduction of this dissertation. Along with concerned psychologists, philosophers of science continue to aver the importance of the demarcation problem and characterize pseudoscience in similar ways, for example: (a) resemblance thinking, or confusing superficial similarities and causal relationships (see also Greasley [2010] and the "doctrine of signatures" in "magical medicine" described in Hand [1985]); (b) overall resistance to theory evaluation vis-à-vis rival theories and selective sensitivity to hypothesis confirmation versus disconfirmation (i.e., outright neglect of the scientific method); and (c) consistent lack of theoretical and evidential progress over time (i.e., stagnation of ideas) relative to more successful scientific research programs (Thagard, 1978, 1980, 1993). The more of these features that can be identified within particular psychological theories, assessments, and interventions, the more discerning and skeptically cautious mental health professionals and consumers should become (Bunge, 1984, fall; Lilienfeld et al., 2003).
Although developing checklists of scientific and pseudoscientific features in psychology (e.g., see Lilienfeld, 2005) [...] or "light" immunization may be defensible. However, the principle of tenacity would be rendered untenable in circumstances involving the reversal of this hypothetical scenario (i.e., few positive results from poorly conducted studies plus many null results from properly designed studies) and would take on an inappropriate form of immunization commonly observed in pseudoscientific "degenerative" research programs (Lakatos, 1970). Thus, hard-and-fast rules are ill advised in this context. Lilienfeld (1998, fall) has likened science and pseudoscience to Roschian concepts (or "open" concepts; Rosch, 1973) given the absence of unambiguous demarcation criteria, although he draws the helpful analogy that distinguishing day from night remains practical despite the absence of a clear-cut line of division between the two. In the context of this metaphor, one may think of some treatments as falling more squarely within the light of the sun (viz., behavioral and cognitive-behavioral strategies) and others as safely confined to nighttime (viz., recovered memory techniques and TFT). However, some treatment packages (e.g., EMDR) present a more complex picture and appear to inhabit a "twilight zone" of sorts given a combination of both helpful and unhelpful components (Antony & Barlow, 2002; Davidson & Parker, 2001; Lohr, Hooke, Gist, & Tolin, 2003). Of note, an in-depth evidentiary overview of EMDR will be provided at the end of this section to illustrate the nuances of disentangling efficacious and effective therapy ingredients from inert ones under the umbrella of a single intervention.
Of interest, most contemporary definitions of pseudoscience in psychology (e.g., Lilienfeld, 1998, fall; Lilienfeld et al., 2003; Ruscio, 2005; Thagard, 1993) have included both content features (e.g., bizarre claims divorced from evidence) and personal reactions to critics (e.g., burden of proof reversal). However, some philosophers of science (e.g., Derksen, 1993; see also Gardner, 1957) and psychologists (e.g., Tolin, 2013, May 28) have contended that the primary concern of genuine scientists should be the "pseudoscientists" themselves, because (as noted by Derksen, 1993), "…it is a person, and not a theory or field, who can have scientific pretensions, and who can be blamed for not making good these pretensions" (p. 21).
The present author contends that it is likely not entirely possible to parse pseudoscientific content from the actions of pseudoscientists as the two most often go hand-in-hand. After all, it is the person who develops and propounds the content, and it is the person who deploys any number of ill-supported (or, in some cases, intellectually dishonest) escape mechanisms (e.g., ad hoc immunization attempts and ad hominem attacks) when repeatedly confronted with disconfirmatory evidence.
At the same time, however, it is difficult to deny that certain content in itself (generated by a person, of course) should be characterized as far-fetched within the context of the provisional corpus of scientific knowledge. For example, the foundational content of acupuncture, which entails the insertion of needles at "acupoints" along twelve undetectable "meridians" (supposedly connected to specific human organs and analogically corresponding to the 12 great rivers of China) to stimulate the flow of invisible qi (i.e., spiritual "energy" undetectable to physicists) to alleviate illnesses, is utterly denuded of evidence (Bausell, 2007, pp. 113-126; Ernst, 2008; Greasley, 2010; Derry, Derry, McQuay, & Moore, 2006; Marcus & McCullough, 2009; O'Connell, Wand, & Goldacre, 2009; Slack, 2010, June) irrespective of the thoughts and behaviors of its most avid historic proponents (e.g., Unschuld, 2003). For this reason, the present author respectfully disagrees with an overly exclusive focus on "pseudoscientists" alone (e.g., Derksen, 1993; Tolin, 2013, May 28). Rather, the author suggests that all-too-human cognitive shortcomings, powerful emotional convictions, and dubiously formulated content divorced from the current state of the scientific research (and, in some cases, deliberate charlatanry) interact to foment and perpetuate pseudoscience. In addition, it is important to note that even highly respected, mainstream research scientists are not immune from the same cognitive and emotional biases, and thus likewise may utilize questionable tactics to defend certain hypotheses or theories of professional interest when feeling intellectually threatened, caught off guard, and/or bereft of a polished rejoinder (cf. Derksen, 1993, for a more strongly polarized version of this argument, viz., "It should be stressed that the excessively pretentious and uncritical scientist is not 'better' than the pseudo-scientist: he is just more lucky because his theory stands in a critical tradition…" [p. 37]).
The contentious demarcation problem. Much passionate philosophical debate has surrounded the history of distinguishing meaningful and meaningless content (Carnap, 2003), which eventually dovetailed with the perceived meaning and utility of distinguishing science from pseudoscience (Gardner, 1957; Pigliucci & Boudry, 2013; Popper, 2002). In the philosophy of science literature, Larry Laudan (1983) famously relegated the perennial demarcation problem to irrelevancy, dubbing it a "pseudoproblem" and referring to the term pseudoscience as merely a "hollow phrase" doing "only emotive work for us" (p. 125). Instead, he emphasized the central importance of theory confirmation (see also Derksen, 1993; Laudan, 1996). Contemporary objections in clinical psychology include that of Richard McNally (2003), who reviewed a now classic text relevant to the demarcation problem in applied clinical practice (viz., Lilienfeld et al., 2003). Following in Laudan's (1983) footsteps, McNally (2003) argued that pseudoscience is merely an "inflammatory buzzword" serving no useful purpose for disentangling legitimate from illegitimate scientific endeavors. Instead, the primary concern of psychological researchers should be inquiring about the state of the evidence for particular claims to the exclusion of demarcation questions (McNally, 2003), thus arguably emphasizing an astringent form of black box evidentialism (see Shackel, 2013, pp. 421-422) or possibly a rote "nose-counting" exercise (Meehl, 1990) with regard to tallying studies with positive outcomes.
Conceding the limitations of a priori plausibility, such as bias introduced by historically predominant yet potentially mistaken scholarly convictions (Ernst, 2003) and the possibility of lapsing into overly dismissive and closed-minded skepticism (Sagan, 1995, January/February), it is difficult to deny that a priori plausibility retains some value for demarcation purposes (see also Beyerstein, 1997, p. 29). For example, when funding agencies are deciding which research to support financially, how would they go about making decisions about treatment outcome studies examining novel interventions (i.e., with no current evidence base) vis-à-vis well-established ones? Is an epistemic free-for-all an economically and pragmatically viable approach whenever a newly proposed intervention emerges? Or would an informed attempt to identify faulty conceptual rationales and mechanisms at the outset prove beneficial for saving valuable time and resources (e.g., those congruent with what is already known to be false and non-scientific, such as the principle of analogic correspondences in astrology and herbal remedies; see Greasley, 2010)? Here, an exclusive academic reliance on strict evidential warrant falls short.
It is the author's contention that there is value in continuing to discuss and clarify the demarcation problem in applied clinical practice given the prevalence of interventions known to be outright ineffective and/or potentially harmful to mental health clients. Otherwise, we may waste time and resources unknowingly chasing pseudoscience, all the while lacking an a priori toolkit for separating sense from nonsense. As acknowledged in the philosophy of science literature, there appear to be two demarcation problems, namely, a philosophical conundrum and a practical challenge, the latter encompassing the substantial influence on public policy decisions in education, medicine, law, and scientific research funding (Resnik, 2000). As articulated by Resnik (2000) at the conclusion of his philosophical analysis of the demarcation problem, "Our reaction should be that one can distinguish between scientific and unscientific activities even though one cannot rely on a set of necessary and sufficient conditions gleaned from an abstract theory of science to perform this task" (p. 258). In other words, despite inherent logical and philosophical difficulties in delineating unequivocal boundaries between science and pseudoscience, a set of helpful (albeit admittedly limited) criteria (e.g., Bunge, 1984, fall) may be applied on a case-by-case basis, although questions about the supposed practical effectiveness of demarcation criteria would ultimately require meta-scientific study to be addressed appropriately (cf. Faust & Meehl, 2002). Attention will now be turned to an extended review of EMDR, and it is hoped that this discussion will assist in illustrating the complexities of distinguishing science from pseudoscience in clinical psychology.

The Twilight Zone of EMDR: Between Shadow and Substance
Since its inception, Eye Movement Desensitization and Reprocessing (EMDR; Shapiro, 1989a, 1995; Shapiro & Forrest, 2004) has remained a hotly debated psychotherapeutic intervention. Although typically lauded as a speedy remedy for PTSD symptoms (Shapiro, 1989a), EMDR is also touted as a "breakthrough therapy" applicable to a wide variety of distressing psychological symptoms according to its founder, psychologist Francine Shapiro (Shapiro & Forrest, 2004). Shapiro (1989a, 1989b, 1994b) has claimed that EMDR can permanently alleviate the symptoms of PTSD in only a couple of sessions and is more effective than extant cognitive behavioral interventions, albeit in the absence of systematic research evidence. Given what is known about the intransigent and often debilitating nature of severe PTSD (among other anxiety disorders), the glowing testimonials and putatively supportive research associated with EMDR should be met with skepticism, and closer scrutiny of the efficacy and effectiveness of this intervention is warranted.

What is EMDR?
The advent of EMDR is not linked to any particular theoretical rationale or compelling logical synthesis, but rather to an anecdotal personal event recounted by Dr. Francine Shapiro. While taking a walk in a park one day in the spring of 1987 and feeling overburdened by distressing thoughts, Shapiro remarked that she instantly felt better after her eyes spontaneously flitted back and forth, thus attributing her improved mood to lateral eye movements (Shapiro & Forrest, 2004). Afterward, she recounted practicing this technique on her friends and acquaintances, many of whom allegedly felt instantly relieved from feelings of anxiety or sadness (Shapiro & Forrest, 2004). Eventually, her experience was translated into the rhythmic back-and-forth visual tracking technique (bilateral sensory stimulation, or BSS) that now defines EMDR (Shapiro, 1999).
The common sequence of steps comprising an EMDR therapy session (cf. Shapiro, 1991; Shapiro & Forrest, 2004) can be summarized as follows. First, the therapist asks the patient to close his or her eyes and imagine the distressing traumatic memory (or a representational image) in vivid detail, much like a typical imaginal exposure. While the patient holds the recalled event in mind, the therapist asks the patient to verbalize any aversive emotional and/or physiological reactions to the event in a sentence. Using a Subjective Units of Distress (or SUDs) scale ranging from 0 (no anxiety) to 10 (extreme anxiety), the patient rates the intensity of psychological distress experienced while imagining the event.
Next, according to the author (cf. Shapiro, 1991; Shapiro & Forrest, 2004), the patient is asked to generate an optimistic statement about the event and subsequently gauge the degree of belief in the positive appraisal using a Validity of Cognition (VoC) scale ranging from 0 (no belief) to 8 (absolute belief). This positive reframing of the traumatic event in conjunction with bolstering its believability constitutes the reprocessing phase of EMDR. While the patient continues the imaginal exposure to the feared scenario, the therapist continues the desensitization process by initiating the technique of BSS, which requires the patient to visually follow the therapist's finger as it sweeps in a lateral, back-and-forth motion approximately 12-14 inches away from the patient's eyes. EMDR therapists typically sweep their finger at a rate of two repetitions per second, with an average set comprising 12-24 repetitions.
Exposure to periodic tones in different ears or finger taps can substitute for finger sweeps if patients are blind or suffer from vision problems (Shapiro, 1994a). Finally, therapists ask their patients to "blank out" or forget about the traumatic image, breathe deeply, and provide follow-up SUDs and VoC scores after each set of finger sweeps. Finger sweep sets are typically repeated until reported SUDs decrease (e.g., SUDs threshold ≤ 2) and VoC scores increase (e.g., VoC threshold ≥ 6) (see description provided by Lilienfeld, 2008; Shapiro & Forrest, 2004).
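As a rough illustration only, the repeat-until-threshold rule just described can be sketched in Python. The function name, the representation of ratings as (SUDs, VoC) pairs, and the sample values are hypothetical and introduced solely for this sketch; they are not drawn from Shapiro's protocol or from any cited source, and the thresholds are merely the example values given above.

```python
SUDS_THRESHOLD = 2  # example threshold from the text: SUDs <= 2
VOC_THRESHOLD = 6   # example threshold from the text: VoC >= 6


def sets_until_criteria_met(ratings):
    """Return the 1-based number of the first finger-sweep set after which
    both example thresholds are met, or None if they are never met.

    `ratings` is a hypothetical list of (suds, voc) pairs, one pair
    recorded after each set of finger sweeps.
    """
    for set_number, (suds, voc) in enumerate(ratings, start=1):
        if suds <= SUDS_THRESHOLD and voc >= VOC_THRESHOLD:
            return set_number
    return None


# Made-up ratings for illustration only (not real patient data).
example_ratings = [(8, 2), (6, 3), (4, 5), (2, 6)]
print(sets_until_criteria_met(example_ratings))
```

Note that the rule depends entirely on the patient's self-reported ratings, which is relevant to later criticisms of overreliance on verbal report.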
The primary difficulty with Shapiro's claims is not necessarily that her proposed bilateral sensory procedure was borne out of a private experience in a park.
Indeed, some ideas emerging from memorable personal experiences, wild hunches, dreams, and other methods of creative or serendipitous freethinking may turn out to be correct upon further testing and corroboration. In his personal correspondence with psychologist Donald Peterson, Paul Meehl pointed out that the German chemist Friedrich Kekulé's reverie of a snake consuming its own tail (viz., an ouroboros) assisted with solidifying the Lewis structure for benzene that remains accepted by chemists to the present day, although other non-dream-related evidence came to bear on this hypothesized structure well before the daydream (Peterson, 2005, pp. 67-68).
Unfortunately, however, it is not possible to know with precision from the history of science how frequently such creative hunches are associated with accurate versus inaccurate scientific findings. That is, we are most likely primarily aware of the successful "hits" as opposed to the "misses," with instances when scientists' creative ideas turned out to be wrong relegated to the dustbins of history. Thus, the dubiousness of the main tenets of EMDR does not necessarily lie in how they were generated per se. As noted by Thagard (1978, p. 225), "Origins are irrelevant to scientific status," although probably not totally irrelevant in this author's opinion (see previous comments on a priori plausibility concerns; moreover, this is a logician's argument that, while logically correct, merits empirical testing of its accuracy). Rather, the central problem is that the proposed mechanisms of action (and their hypothesized effects) have repeatedly failed to withstand the rigors of scientific testing.

How does EMDR purportedly work?
There is a clear consensus among most critics of EMDR that the underpinning theoretical rationale is poorly elucidated and does not square with what is known about the etiology, maintenance, and alleviation of pathological trauma and anxiety (e.g., Keane, 1998;Lilienfeld, 2008;Lohr et al., 2003). As Keane and Barlow (2002) noted, although these criticisms are not grounds for total a priori dismissal of the potential utility of EMDR, it is important to consider the utility of sound conceptual foundations supported by previous research, which in turn may assist with formulating plausible hypotheses and predictions.
To her credit, Shapiro (e.g., 1994b) has attempted to clarify how EMDR works, although its fit with current understanding of neuroscience and models of cognitive behavioral change is highly questionable (Keane & Barlow, 2002).
Specifically, Shapiro (1995; cf. Shapiro & Forrest, 2004) posited that the mechanism behind EMDR relies on accelerated information processing (AIP), an explanatory model purportedly based on neuropsychological principles. In brief, a given traumatic event is thought to impinge upon the nervous system in such a way that distressing information associated with the event becomes encoded without being processed, resulting in neurobiological "blockages." Traumatic memories are thus improperly stored and must undergo more adaptive reprocessing and assimilation in the brain through EMDR techniques, such as the back-and-forth eye movements. BSS purportedly expedites the neuropsychological processing of traumatic material by moving information more efficiently through memory networks, much like unclogging a pipeline. Through this dynamically activated processing system, the traumatic content is unlocked. Of note, this explanation can be characterized as a reification fallacy (cf. Gabel, 1976), or concretizing a conceptual metaphor as a psychophysical mechanism of action. Shapiro (1994b) has also conjectured that beneficial effects of EMDR may result from mimicking eye movements similar to those observed during rapid eye movement (REM) sleep, which supposedly aid in the "processing" of traumatic memories previously inaccessible to the "conscious mind." However, no neuropsychological mechanism has been specified to explain what drives this "processing" or how it could alleviate distress, and there is as yet no research demonstrating that brain activity associated with undergoing EMDR reflects brain activity during REM sleep (Lilienfeld, 2008). Also of note is the disconnect between involuntary REM sleep eye movements and the smooth, voluntary visual tracking of stimuli in EMDR (Lohr, Tolin, & Lilienfeld, 1998), a qualitative biological difference never broached by EMDR advocates.
In addition, Shapiro's contention that traumatic memories can be repressed or blocked (i.e., in a manner different from forgetting) is itself a deeply controversial claim that lacks a foundation of systematic scientific research support outside of clinical folklore and confected anecdotes (McNally, 2004).
EMDR was even featured during a 1995 ABC News 20/20 segment (Walters, 1995), which relied on appeals to authority (i.e., the proclaimed expertise of psychologist [...] "EMDR sounds like utter nonsense, but this weird thing has a profound effect on people" (Marsa, 2002, March 25). Other dramatic and colorful claims include the supposed ability of EMDR to "pinpoint a specific trauma and target that like a laser beam" (Marsa, 2002, March 25).
By the mid-1990s, over 14,000 psychotherapists had been officially trained to administer EMDR in the United States and abroad (Bower, 1995).
Another major methodological weakness characterizing Shapiro's earlier studies (e.g., Shapiro, 1989a) is the absence of control groups (see Lilienfeld, 2008).
Lacking a control group results in a failure to account for the hypothetical counterfactual, or how the patient's symptoms and distress would have fared in the absence of treatment administration (Dawes, 1994). For reasons often articulated in methodology and design texts (e.g., Campbell & Stanley, 1963), drawing strong conclusions about unique treatment effects from uncontrolled studies is problematic (e.g., inability to draw sound causal inferences, difficulty ruling out placebo effects, history effects, maturation, instrumentation, regression to the mean, spontaneous remission, etc.). In addition, combinatory treatments with impure independent variables (e.g., EMDR plus relaxation plus exposure) obfuscate unique effects attributable to BSS, which remains a key component of EMDR purported to substantially decrease psychological distress (Shapiro & Forrest, 2004).
Lohr and colleagues (1998) reviewed 17 group-design investigations of sounder methodological quality and rigor (e.g., inclusion of random assignment and dismantling designs) compared to previous uncontrolled designs (e.g., Shapiro, 1989b). These authors uncovered systematic evidentiary trends directly contradicting claims of superior efficacy made by EMDR advocates, such as (a) effect size equivalence (Cohen's d = .90) across EMDR and non-EMDR exposure treatments; (b) lack of control for therapist by treatment confounds (e.g., therapist enthusiasm and allegiance to EMDR); (c) overreliance on participants' verbal reports of feeling better in the absence of behavioral and physiological measures; (d) lack of significant differences between EMDR and exposure controls (with stationary eye analogue) on behavioral or physiological indicators when they were used (e.g., heart rate, skin conductance, and blood pressure); and (e) lack of significant differences in reported symptom (e.g., Mississippi PTSD Scale) and associated distress (e.g., SUDs) reduction rates across EMDR versus exposure control conditions over time (posttreatment to six-month follow-up).
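For readers less familiar with the effect size metric cited above, the following sketch shows one common way to compute Cohen's d, using the pooled standard deviation of two independent groups. The scores are hypothetical and chosen only for illustration; they are not data from Lohr and colleagues (1998) or any other study, and the function name is an assumption of this sketch.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups, using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical symptom-reduction scores (illustrative only, not study data)
emdr = [12, 15, 9, 14, 11, 13]
exposure = [13, 14, 10, 15, 10, 12]
print(round(cohens_d(emdr, exposure), 2))
```

When each of two treatments produces a similar d relative to a no-treatment control, the d computed directly between the two treatments will be close to zero, which is the pattern described as effect size equivalence.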
In the PTSD studies that Lohr and colleagues (1998) examined, although exposure controls and EMDR both yielded better outcomes on SUDs ratings, heart rate, and PTSD symptom ratings compared to no-exposure control groups, EMDR and exposure controls did not significantly differ on any outcome measures. In the reviewed panic studies, EMDR was more efficacious than no treatment but equivalent to no-movement bilateral stimulation analogues. In the reviewed specific phobia studies, although self-report fear reductions were significantly greater in EMDR conditions in some comparisons (i.e., Muris et al., 1997), participants assigned to in vivo exposure conditions showed greater improvement on avoidance indicators than participants in EMDR conditions, even when EMDR therapists' clinical experience surpassed that of the in vivo therapists.
The majority of Lohr and colleagues' (1998) conclusions from their comprehensive review have been independently corroborated in earlier (e.g., Foa & Meadows, 1997) as well as later (Albright & Thyer, 2010) studies and reviews. In the context of phobic avoidance, Antony and Barlow (2002) reported little to no behavioral (e.g., avoidance ratings made by researchers) or physiological (e.g., lower heart rate and blood pressure) evidence to corroborate patients' verbal reports of fear reduction. In other words, objective indicators have not been utilized to bolster the beneficial impact of EMDR on symptom alleviation. Rather, positive effects remain confined primarily to patients merely saying that they feel better, thus failing to meet triangulation standards (cf. Campbell, 1956).
In a recent study of 74 female rape victims with chronic PTSD symptoms (Rothbaum, Astin, & Marsteller, 2005), there were no significant differences in PTSD symptom reductions between EMDR and prolonged imaginal exposure (PE). This study had the added benefit of including self-report instruments (e.g., the PTSD Symptom Scale-Self Report and Impact of Event Scale-Revised) as well as structured and semi-structured clinical assessment interviews (e.g., the Clinician-Administered PTSD Scale and SCID) conducted by independent raters blind to treatment condition.
Treatment integrity ratings were also included.
It is important to note that a number of studies questioning EMDR efficacy and effectiveness can be classified as experimental dismantling studies 7 (e.g., Cahill, Carrigan, & Frueh, 1999; Renfrey & Spates, 1994), which are sometimes called additive or subtractive designs. These studies entail the removal of specific treatment elements (i.e., BSS) from otherwise intact treatment packages; the reduced package can then be compared to treatments containing the component in question to determine whether removal diminishes treatment efficacy (cf. Hart, Fann, & Novack, 2008).
In the Middle Eastern study (Jaberghaderi et al., 2004), 14 Iranian children with histories of sexual abuse were randomly assigned to either an exposure-based CBT condition (n = 7) or an EMDR condition (n = 7) to be treated for trauma symptoms. Comparisons of pre- and post-treatment measures of self- and other-reported PTSD symptoms alongside behavioral ratings made by parents and teachers revealed roughly equal efficacy of CBT and EMDR, but the authors concluded that EMDR was more efficient (e.g., fewer sessions and more rapid intra-session SUDs reductions). However, this latter conclusion is questionable given that the CBT condition required a minimum of 10 sessions with heavy psychoeducational requirements, whereas the EMDR condition categorically lacked these requirements with termination contingent on quickly reaching low SUDs thresholds. Similar favorable results were found for the EMDR intervention in the South American study (Adúriz et al., 2009), which recruited 124 schoolchildren who had been forced to evacuate their homes with their families due to severe flooding. Although SUDs ratings and PTSD symptoms (e.g., intrusion and avoidance symptoms) significantly decreased from pre- to post-treatment and remained significantly lower at 3-month follow-up, there was no CBT group or control group included for comparison.
Enhancing both the methodological quality of these sorts of studies and the associated participant recruitment efforts is critical given the insufficient inclusion rates of ethnic minorities and non-North American samples in efficacy studies and small sample sizes, shortcomings that have continued to hamper generalizability (i.e., external validity) of results to ethno-culturally diverse populations (Miranda, Nakamura, & Bernal, 2003).

EMDR: A concluding summary of the evidence.
The aforementioned studies all converge on the same conclusion: The observed effectiveness of EMDR in the extant literature can be reasonably attributed to its imaginal exposure component and not BSS (Antony & Barlow, 2002;Davidson & Parker, 2001). Evidence supporting the therapeutic contribution of bilateral sensory stimulation components (i.e., lateral ocular movements and alternating visual field stimulation) is either weak or non-existent (Davidson & Parker, 2001;Lohr et al., 2003). When writing their comprehensive review of state-of-the-art interventions for PTSD, Keane and Barlow (2002) asserted that no existing study to date had demonstrated incremental efficacy of EMDR over and above any existing evidence-based treatment for PTSD (i.e., anxiety management training, cognitive restructuring techniques, and imaginal exposures).
Nearly thirteen years later, this observation remains unchallenged by methodologically sound studies. Furthermore, compared to individuals who have undergone EMDR, those who receive traditional CBT for PTSD symptoms have attained greater treatment gains as evidenced by both post-treatment and follow-up assessments (Devilly & Spence, 1999). Until a coherent synthesis of data contradicts this body of evidence, there would seem to be no defensible rationale for replacing tried-and-true CBT techniques with EMDR.
Of note, this review illustrates EMDR proponents' striking neglect of rigorous scientific inquiry into purported mechanisms of change through various pseudoscientific maneuvers, namely, (a) dispensing with the proper methodological toolkits associated with efficacy and effectiveness research; (b) repeatedly exaggerating unfounded claims that a specific, unsupported treatment element (in this case, BSS) works; and (c) ignoring extant evidence that the impact of BSS appears to be no greater than non-specific or placebo effects at best. Thus, the BSS component of EMDR may be viewed as pseudoscientific as a function of its inertness in the inseparable context of intransigent confirmatory attitudes of avid proponents as well as the scientific implausibility of the proposed theoretical underpinnings. This illustrates a possible compatibilist stance between the distracting debates on whether we should focus exclusively on pseudoscientists (e.g., Derksen, 1993;Tolin, 2013, May 28) or pseudoscientific content (e.g., Bunge, 1984, fall;Lilienfeld et al., 2003). In this sense, the pseudoscience label may provide a concise and informative descriptive heuristic, as it is not being used in a cavalier, dismissive, or inflammatory ad hominem manner.
The burden of elucidating and laying the ground for testing proposed neuropsychological mechanisms of action of EMDR lies with the claimants. In lieu of evasive, ad hoc defensive maneuvering, such as Shapiro's revisionist statement suggesting that eye movement is neither a necessary nor sufficient treatment component (see Lohr, Hooke, Gist, & Tolin, 2003), EMDR proponents should try to develop a concrete set of testable hypotheses nested within a clear, coherent rationale drawing from the extant research literature on anxiety and trauma. However, given the data already scrutinized, it may be argued that such a step would thrust us squarely into a fallacy of misplaced rationalism (Sheaffer, 2008). In other words, EMDR enthusiasts may be attempting to speciously explain an inert, non-existent phenomenon from a position of post hoc rationalization, thus further exacerbating the pseudoscientific practice of immunization from falsification (see Bunge, 1984, fall).
In this sense, future research directions are not entirely clear.

Pseudoscience: What's the Harm?
Interventions based on pseudoscience may not merely contain inert components that fail to provide benefits (e.g., as observed with BSS in EMDR); rather, they may harm clients. Overall, psychotherapy researchers have paid scant attention to baneful consequences of psychological treatments in recent decades, although within the past 5 years, some literature has focused on raising awareness of potentially harmful treatments (or PHTs; Castonguay et al., 2010) and how to detect and address such effects (Dimidjian & Hollon, 2010). Moreover, some psychologists support the creation of an official list of PHTs similar to existing EST lists (Castonguay et al., 2010; Lilienfeld, 2007).
CISD is a 3-4-hour, single-session group psychotherapy procedure in which clients openly disclose distressing thoughts and feelings in the aftermath of a potentially traumatic event, presumably to avoid the onset of PTSD symptoms (Lohr, Hooke, Gist, & Tolin, 2003). According to the guidelines of CISD, clients must (a) participate in therapy no later than 24 to 72 hours after the trauma, (b) discourage one another from leaving the therapy group once it has begun, and (c) discuss possible PTSD symptoms that they may face as a result of the traumatic event (Lohr & Fowler, 2002, summer; Lohr et al., 2003). Of note, CISD has been found to be consistently ineffective at best and possibly harmful at worst (e.g., worsened PTSD symptoms in CISD groups compared to assessment-only controls) in treating PTSD symptoms across both meta-analyses (e.g., effect size of d = -.11; Litz, Gray, Bryant, & Adler, 2002) and RCTs (e.g., Bisson, Jenkins, Alexander, & Bannister, 1997; Mayou, Ehlers, & Hobbs, 2000). Some researchers have hypothesized that observed PTSD symptom exacerbation at follow-up may be partly due to interference with natural symptom remission (Gist & Woodall, 1995). Rather than using systematic desensitization (with the goal being gradual habituation) in tandem with practicing alternative adaptive responses to alleviate symptoms associated with distressing aspects of a trauma (as is the case in evidence-based CBT approaches; Foa, Zoellner, & Feeny, 2006), CISD encourages therapists to ask direct questions about the worst aspects of the trauma during the reactions/cathartic ventilation phase (i.e., shortly after the trauma) in the absence of teaching coping techniques (Devilly, Gist, & Cotton, 2006). This may be especially problematic for subgroups of patients who struggle with dysregulated hyperarousal (Devilly et al., 2006).
RT is a type of attachment therapy that has been flagged as potentially dangerous depending on how it is practiced (Mercer, 2008 The Science of Everlasting Life). Orr proposed that human birth is always accompanied by a fear of suffocation triggered by the premature severing of the umbilical cord, which supposedly damages the person's "breathing mechanism" and embeds panic deep into the subconscious mind (Singer & Lalich, 1996, pp. 42-43).
This repressed fear purportedly resurfaces in the form of both psychological (e.g., anxiety, depression, and low self-esteem) and physical ailments (e.g., allergies, weight problems, and cancer), and Orr proclaimed that several 2-hour "rebirthing" sessions (e.g., practicing patterned breathing techniques while floating or snorkeling in a hot tub) are usually sufficient for healing these conditions as well as fostering psychic abilities (Singer & Lalich, 1996, pp. 42-44).
Other variants of rebirthing include dramatic "recapitulations" (as described in the psychoanalytic attachment literature) of the birth process by wrapping up clients, especially young children with developmental problems, in carpets or blankets while squeezing them, taunting them, and encouraging them to struggle free (Lilienfeld, 2007; Mercer, 2008). Not only do these preposterous and needlessly abusive techniques lack any supporting research evidence (e.g., no RCTs have been conducted, and no evidence of efficacy or effectiveness can be found in the peer-reviewed literature), but they also have resulted in reported serious injuries and even deaths, including the asphyxiation of a 10-year-old girl in Colorado in 2000 (Mercer, 2008).
Of note, the two social workers responsible for the girl's death were sentenced to 16 years in prison, and a new legal mandate known as Candace's law, which prohibits restraint in psychotherapy, was passed in Colorado and North Carolina shortly thereafter (Josefson, 2001;Mercer, 2008).
Finally, RMT entails the use of various highly questionable, poorly supported techniques (e.g., hypnosis, "age regression," sodium pentothal administration, guided imagery, and/or therapist interpretations of symptoms) to recover memories of traumatic past events assumed to have taken place during a client's childhood (Lynn, Loftus, Lilienfeld, & Lock, 2008). As summarized elsewhere (Loftus, 1993; Lynn, Lock, Loftus, Krackow, & Lilienfeld, 2003; Lynn et al., 2008; Singer & Lalich, 1996; Singer & Nievod, 2003), these memories supposedly become "repressed" deep into the unconscious mind due to the intense trauma and emotional pain associated with aversive early experiences. The supposed recovered memories are frequently of questionable veracity and have a number of bizarre cottage industry therapy movements associated with them (e.g., Satanic ritual abuse therapy, alien abduction therapy, past-life regression, and entities therapy; see Singer & Nievod, 2003). RMT advocates (e.g., Fredrickson, 1992)
Sadly, RMT has resulted in a multitude of wrongful prosecutions of and civil lawsuits against parents who allegedly sexually abused their children (Loftus, 1995; Maran, 2010; Wakefield & Underwager, 1992), evidence for which was gathered during therapy sessions using the questionable procedures mentioned in the previous paragraph (see also Loftus & Ketcham, 1994). Of note, many of these unfortunate accusations occurred in the heat of a mass hysteria known as the "Satanic Panic," which swept across the United States during the 1980s and early 1990s (Nathan & Snedeker, 2001; Victor, 1993).
During this period, popular daytime television talk shows (e.g., Geraldo Rivera, Oprah Winfrey, Sally Jessy Raphael, and Donahue) dramatically belabored the supposed existence of a large, clandestine sect of Satanists gleefully involved in routine macabre activities, including human sacrifices, animal mutilations, desecrations of religious buildings, cannibalism, the distribution of illicit drugs, kidnapping, pedophilia, and the production and distribution of child pornography (Victor, 1993). In professional mental health circles, popular therapy manuals, such as Bass and Davis' (1990) The Courage to Heal Workbook, further perpetuated dangerous and irresponsible claims, such as the classic assertions, "If you are unable to remember any specific instances… but still have a feeling that something abusive happened to you, it probably did" (p. 21), and, "If you think you were abused and your life shows the symptoms, then you were" (p. 22). One particular author (Cautin, 2011)  involves lightly tapping meridian points at various locations on the body with one's fingers while voicing positive self-affirmations, the purported goal being to unblock thought field "energy" obstructed by trauma (Feinstein, 2008). As for the second point, for a large subset of people experiencing typical bereavement (e.g., dysphoria following the death of a close family member), grief counseling appears to be associated with a clear deterioration of psychological and behavioral functioning post-treatment (compared to no treatment) according to a meta-analysis of RCTs (Neimeyer, 2000). However, grief counseling itself does not consist of pseudoscientific approaches per se (e.g., fostering social and familial support, reinforcing meaning making associated with death, and reflecting on positive memories) and may yield more beneficial effects for complicated grief (see Allumbaugh & Hoyt, 1999; Altmaier, 2011; Bonanno & Lilienfeld, 2008).
These considerations aside, practicing clinicians serve their clients best by avoiding harm to the extent possible (i.e., when it is foreseeable; APA, 2002) and using treatments supported by scientific evidence. Deferring to personal preferences for treatments (e.g., TFT for PTSD) and tenaciously maintaining that a suboptimal treatment choice is justified because it has not been shown to be associated with harm arguably constitutes unethical clinical practice, especially when the current state of the evidence is ignored. Unfortunately, however, the current APA Ethics Code (2002) explicitly confers equal status to professional judgment and peer-reviewed research findings in both clinical practice and pedagogical decisions, which is unjustified given the large clinical decision-making literature.

Clinical Decision Making
The clinical method and the actuarial method (Dawes, Faust, & Meehl, 1989) are the two central decision-making methodologies for predicting behavior or outcomes in clinical psychology. The clinical method entails an "in the head" or impressionistic synthesis of information to arrive at a conclusion, whereas the actuarial method relies on empirically established relations between information or data (e.g., frequencies) and the outcome of interest, which are analyzed formally (e.g., using mathematical formulae or tables; Grove & Meehl, 1996) to reach a conclusion or probability statement. Since the mid-twentieth century, hundreds of studies in the decision-making literature have converged on the conclusion that the actuarial method almost always equals or exceeds the clinical method in accuracy (e.g., when predicting the presence or absence of a diagnosis), sometimes by a small margin and sometimes by a considerable margin, and hence is the superior method overall (Dawes, 1971; Einhorn, 1972; Goldberg, 1965, 1968; Meehl, 1954; Sawyer, 1966). This clear-cut trend has continued to emerge in meta-analyses as well (e.g., Aegisdóttir et al., 2006; Grove et al., 2000).
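As a rough illustration of what "analyzed formally" can mean in practice, the sketch below implements a toy actuarial table. Every element of it is hypothetical: the two indicators, the case counts, and the 0.5 cutoff are assumptions invented for illustration and do not correspond to any published actuarial instrument.

```python
# Hypothetical actuarial table: indicator pattern -> (cases with outcome, total cases observed)
ACTUARIAL_TABLE = {
    (True, True): (45, 60),
    (True, False): (20, 80),
    (False, True): (15, 100),
    (False, False): (5, 160),
}

def actuarial_probability(elevated_scale, prior_episode):
    """Return the observed outcome frequency for this pattern of indicators."""
    cases, total = ACTUARIAL_TABLE[(elevated_scale, prior_episode)]
    return cases / total

def actuarial_decision(elevated_scale, prior_episode, threshold=0.5):
    """Apply a fixed cutoff so that identical data always yield the identical call."""
    return actuarial_probability(elevated_scale, prior_episode) >= threshold

print(actuarial_probability(True, True))   # 45 / 60
print(actuarial_decision(False, True))
```

The defining property is that the conclusion follows mechanically from tabulated frequencies: two clinicians entering the same indicators obtain the same probability statement, whereas an impressionistic synthesis of the same information may differ across judges and occasions.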
It is evident across this vast literature that many applied clinicians either ignore or inappropriately countervail readily available actuarial data and instead defer to clinical judgment. To the author's knowledge, there are no formal investigations of individual cognitive and psychosocial factors undergirding this irrational decisional intransigence, although Meyer, Baker, and Baker (2012, March) summarized possible misguided epistemological justifications for overreliance on the clinical method.
These include (but are not limited to) the following: (a) the fallacy of commensurate complexity, or the erroneous notion that human behavior is so complex that only equally complex methodologies could provide accurate prediction (cf. Faust, 2007); (b) the fallacy of argumentum ad experientiam, or stubbornly deferring to one's own self-assumed clinical expertise and experience as the best method for predicting outcomes; (c) an illusion of perfect predictability or consistent error avoidance via faulty methods, also framed as a steadfast resistance to accepting error to yield less error (see Einhorn, 1986); and (d) avowal of the ecumenical decree that no controversy exists-one can simply integrate both methods (Grove & Meehl, 1996), presumably even if they yield contradictory or mutually incompatible outcomes.
However, such postulated reasons for resistance to the use of actuarial methods remain to be tested and elucidated through a systematic program of research with clinical practitioners as the target population.
Despite the well-documented limitations of expert clinical judgment, including studies illustrating a weak relationship between clinical experience and accuracy 8 (see Dawes, 1989; Lilienfeld et al., 2003), commensurate levels of accuracy among novices and experts on a number of judgment tasks (especially when clinicians lack sound scientific evidence or ignore it; Goldberg, 1968; Weck, Weigel, Richtberg, & Stangier, 2011), and observed deterioration of knowledge once the practitioner completes formal education and training (Vollmer, Spada, Caspar, & Burn, 2013), many clinical psychology authors continue to profess an exaggerated faith in clinical expertise. Some even go so far as to declare by fiat, in the absence of supporting data, that many years of clinical experience render one a more effective therapist (e.g., Betan & Binder, 2010). For example, Overholser (2010) (Neimeyer, Taylor, Rozensky, & Cox, 2014). In other words, after roughly 9-11 years pass following the completion of their doctoral degree, 50% of what psychologists learned in graduate school will become obsolete. Thus, ironically, the very encouraging finding that knowledge is advancing rapidly in psychology is potentially offset by the extent to which such knowledge may be routinely disregarded or replaced with pseudoscientific or insufficiently validated beliefs.
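The 50% obsolescence figure can be read as a half-life. The sketch below extrapolates it under the assumption of simple exponential decay, which is an assumption of this illustration rather than a model stated in the text; the 10-year half-life is likewise an assumed midpoint of the 9-11 year range reported above.

```python
def fraction_still_current(years_since_degree, half_life_years):
    """Fraction of graduate-school knowledge still current, assuming exponential decay."""
    return 0.5 ** (years_since_degree / half_life_years)

# Assumed half-life of 10 years (midpoint of the 9-11 year range)
for years in (5, 10, 20, 30):
    share = fraction_still_current(years, half_life_years=10)
    print(f"{years:>2} years out: {share:.0%} still current")
```

Under these assumptions, a practitioner two half-lives past graduation who has not updated would retain only about a quarter of still-current knowledge, which illustrates why accumulated years of experience alone offer no protection against obsolescence.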
With reference to explicit overvaluation of clinical experience and expertise, Betan and Binder (2010) recently introduced metabolizing theory, which presumes that "expert" therapists differ from novices in that they engage in a flexible, adaptive, and accurate intuitive synthesis of theoretical and clinical knowledge outside of conscious awareness when formulating case conceptualizations and treating clients. In contrast, novice or non-expert therapists (i.e., those with far fewer years of accrued clinical experience) presumably lack a metabolic, intuitive grasp of core concepts.
Despite their apparently genuine convictions, the authors did not reference a single study supporting these bold assertions. Such assertions are likely better categorized as questionable conjecture or hypothesis as opposed to a theory given the absence of supporting evidence and the overwhelming presence of negative findings that contradict such optimistic pronouncements about the superior levels of accuracy achieved through experienced clinical judgment (Garb, 1998). Strongly worded yet evidentially hollow claims of this sort pertaining to clinical expertise may inadvertently instill a sense of professional complacency and provide a disincentive to keep up with relevant scientific research and/or utilize evidence-based tools.
All of this is not to say, however, that clinical judgment should be the nemesis of applied psychologists. Clinical intuition, expertise, and judgment play valuable roles in the context of discovery (e.g., hypothesis generation) and should not be considered second-class given their role in advancing the field (Lilienfeld, 2010;Chambless, 2014). Practitioners who draw from their rich clinical experiences in formulating case studies or presentations and subsequently utilize this knowledge to propose clear, sensible, and testable hypotheses for future research are constructively contributing to the true integration of science and practice (Lowman, 2012). Some of these hypotheses turn out to be correct (or nearly so), and at present, it is difficult to conceive of other ways by which novel clinical ideas could be generated. As a case in point, Chambless (2014)  informative dialogue between practitioners and researchers may help close the widening research-practice gap by having practitioners assist as research "problem finders," whereas researchers would primarily serve as "problem solvers," although these roles blend together for some psychologists (Chambless, 2014;Goldfried et al., 2014). Nevertheless, these same mental processes that shape initial clinical impressions and fledgling research agendas are certainly not immune from the pervasive influence of cognitive biases (e.g., confirmation bias), especially in the context of justification (Lilienfeld, 2010).

Cognitive Characteristics of Clinicians
As acknowledged in the decision-making literature, clinical psychologists routinely face complex and ambiguous scenarios that demand fairly rapid decisions about specific courses of action (e.g., diagnosis, suicide and violence risk assessment, intervention selection, and prognostic forecasting; Oltmanns & Klonsky, 2007).
Unfortunately, psychologists in applied practice are rarely given immediate corrective feedback on their decisions about specific scenarios followed by opportunities for repeated practice (i.e., deliberate practice; Ericsson, 2005; Lewandowsky, Little, & Kalish, 2007), which are known to be invaluable conditions for establishing expert levels of decisional accuracy (Ashby & O'Brien, 2005). An example of a field in which perceptual expertise is developed through repeated deliberate practice is ornithology, where experts gradually learn how to quickly and accurately identify birds at subordinate levels of representation (Krigolson, Pierce, Holroyd, & Tanaka, 2008). A skilled ornithologist, for example, would be able to identify an American flamingo as belonging to the species Phoenicopterus ruber as rapidly as a novice could identify it as a pink flamingo. In contrast, clinical practitioners typically cannot quickly and accurately diagnose a client based on readily observable pathognomonic features, not necessarily because they lack competence, but in part because of the complex heterogeneity of mental illness manifestation 9 (Seaton et al., 1999).
Furthermore, they often lack access to accurate corrective diagnostic feedback in their work settings in the same way that an ornithologist could compare observed birds to a textbook of representative exemplars, although professional consultation may assist with diagnostic accuracy depending on the nature and quality of the feedback. The feedback they do receive often contains a fairly large error component, which can greatly diminish the benefits of experience and easily foster mistaken belief (Dawes, 1989).
Faced with these environmental pressures (e.g., obstacles to proper experiential learning in a fast-paced work environment), clinical psychologists are understandably susceptible to a pernicious host of cognitive biases and heuristics, which may result in suboptimal clinical decision-making procedures, overconfidence, an illusion of learning, and decreased judgmental accuracy (Arkes, 1981; Dawes, 1994; Dawes et al., 1989; see also Tversky & Kahneman, 1974). As Meehl (1993) more forcefully stated, "It is absurd, as well as arrogant, to pretend that acquiring a Ph.D. somehow immunizes me from the errors of sampling, perception, recording, retention, retrieval, and inference to which the human mind is subject" (p. 728). Generally speaking, cognitive heuristics can facilitate rapid adaptive choices in an overwhelmingly complex world, but they may also muddle perceptions and reinforce false beliefs (Kahneman, 2011; Tversky & Kahneman, 1974).
Cognitive and personologic variables contributing to clinician susceptibility to pseudoscientific beliefs are not well researched or formally understood, but it is plausible to hypothesize that such susceptibility stems largely from the ubiquitous human vulnerability to cognitive biases (and, more generally, irrationality) alluded to in previous paragraphs 10. For example, following their initial attraction to and utilization of pseudoscientific treatments owing to some combination of biases and proclivities, clinicians may subsequently fall into what social psychologist Anthony Pratkanis (1995, July/August, p. 21) calls a rationalization trap. This entails developing a gradual personal commitment to (or, stated another way, building an emotional investment in) the core principles of the intervention, which may be facilitated by various cognitive biases and stressors (e.g., confirmation bias and cognitive dissonance). Of course, cognitive bias susceptibility and illogical thinking may be more acutely amplified in some individuals for whatever reasons. Some studies have found, for example, that individuals who believe more strongly in the paranormal tend to commit more logical and probabilistic judgment errors than their more skeptical peers (see Majima, 2012).
Interestingly, in tandem with the rationalization trap, some clinicians may become so committed to particular theoretical frameworks that they begin to experience the following phenomenology: (a) ownership of their position, (b) perceiving their position as an extended part of their self-concept, and (c) subsequently perceiving any criticisms of the underlying conceptual and/or evidentiary apparatus as an attack on themselves (e.g., akin to an erroneously perceived ad hominem attack; see de Dreu & van Knippenberg [2005] for preliminary experimental evidence for this possibility). Involvement with "granfalloons" (Vonnegut, 1973, as cited in Pratkanis, 1995, July/August, p. 22), or "proud and meaningless" in-groups emphasizing a cohesive social identity associated with shared beliefs and jargon, may further reinforce wayward clinicians' commitment to favored interventions and/or theoretical frameworks (e.g., see McNally [1999] for a telling description of the EMDR Institute, Inc., and its members' activities). This may also serve to isolate them from informed skeptics by reinforcing perceptions of them as hostile out-group members (Pratkanis, 1995, July/August).
Regrettably, exposure to general higher education alone does not appear to be a sufficient buffer against the proliferation of pseudoscientific and paranormal beliefs.
For example, a paranormal beliefs survey given to 133 allied health students (e.g., undergraduate and graduate students in the fields of physical therapy, medical technology, and health administration) from two universities (including an Ivy League university) revealed the following: belief in extrasensory perception (46%), perceived legitimacy of chiropractic medicine (36%), and claims of telepathic experiences (25%) (Duncan, Donnelly, Nicholson, & Hees, 1992 (Wesp & Montgomery, 1998). Additionally, significant reductions in paranormal beliefs found among students in an undergraduate science seminar and a pseudoscience seminar (compared to a quasi-control group) were maintained over a 2-year period following the end of the courses (Dougherty, 2004; Morier & Keeports, 1994). Perhaps similar improvements might be attained were clinical psychologists exposed to educational interventions designed to strengthen relevant critical thinking skills and the ability to differentiate scientific from pseudoscientific therapeutic claims, although this has yet to be formally tested.
Baker and colleagues (2008)  Of note, factors "a" (intuitive decision-making preferences) and "d" (ambivalence or, in some cases, hostility toward science) in particular have been repeated foci of discussion in the pseudoscience literature in clinical psychology (Lilienfeld, Lynn, & Lohr, 2003) and speak to important cognitive-emotional preferences of applied psychologists. However, as Garb and Boyle (2003, p. 30) noted, no studies to their knowledge (or the current author's knowledge) have attempted to qualitatively or quantitatively investigate these features, especially as they relate to specific behavioral consequences (e.g., the aforementioned factors "b" and "c") pertinent to professionally and ethically responsible mental health care.
In addition, apart from a small number of recent surveys, which are limited in scope (e.g., Sharp et al., 2008), there is limited understanding of how critical thinking skills and cognitive styles relevant to clinical decision-making relate to knowledge about evidence-based and pseudoscientific treatments. These variables are worthy of further research given the proliferation of highly questionable and, in many cases, suboptimal interventions offered to mental health clients. As Lilienfeld (2010, p. 283) noted in his review of the treatment literature, which is only a small snapshot of the pervasiveness of the problem at hand, (a) tens of thousands of clinicians have received EMDR training (cf. Bower, 1995; Shapiro, 2004); (b) most clients suffering from anxiety, mood, eating, and autism spectrum disorders do not receive scientifically supported psychotherapies; and (c) increasing numbers of clients suffering from mental illness receive unsupported and questionable interventions, such as "energy therapies" (e.g., TFT). Although a number of studies lend credence to the claim that poor critical and scientific thinking skills are associated with stronger beliefs in pseudoscience and superstition among high school and college-age students (e.g., Bennett, 1991; McKenzie, 1986), this has yet to be examined among applied psychologists in active clinical practice.

Study Aims and Hypotheses
Given the absence of an existing measure, the primary objective of this study was to develop a questionnaire ( and (d) the ability to read, write, and understand English.
Of the 2,000 questionnaires mailed, 345 surveys were returned completed in full or part (see further below for details). In addition, 126 unopened survey packets were received by return mail, indicating outdated practitioner addresses (e.g., practitioners who had relocated or were deceased as noted on the envelopes), and 12 psychologists contacted the student investigator directly by telephone or email to decline participation for various reasons (e.g., they were retired, no longer in applied clinical practice, no longer licensed, or did not have time to complete the survey). within the sample (see Table 1 for a more detailed summary of demographic and professional characteristics of participants). Dawes, 1994; Dawes et al., 1989; Meehl, 1954 Lilienfeld et al., 2008, p. 25), and substantially more public attention is garnered by highly questionable clinical techniques, such as past-life regression, TFT, and rebirthing vis-à-vis more evidence-based practices (Olatunji, Parker, & Lohr, 2006).

Rational-Experiential Inventory (REI): Selected Items.
The REI is a widely used questionnaire validated for differentiating rational and intuitive thinking styles (Epstein, Pacini, Denes-Raj, & Heier, 1996; Pacini & Epstein, 1999). All items are rated along a five-point Likert-type scale where 1 = "Definitely False," 3 = "Undecided or Equally True and False," and 5 = "Definitely True." The REI contains two negligibly correlated scales: Rationality (formerly the Need for Cognition scale), which measures rational thinking preferences (e.g., "I prefer complex problems to simple problems"), and Experientiality (formerly the Faith in Intuition scale), which assesses more intuitive thinking styles (e.g., "I believe in trusting my hunches"). To reduce questionnaire length, the present study included five items from the Experiential Ability (EA; α = .80) subscale of the Experientiality scale, which refers to perceiving oneself as having sophisticated intuitive skills (e.g., "I hardly ever go wrong when I listen to my deepest 'gut feelings' to find an answer"); and five items from the Experiential Engagement (EE; α = .79) subscale of the Experientiality scale, which refers to a preference for and enjoyment of intuitive decision making (e.g., "I like to rely on my intuitive impressions") (EA-EE subscale inter-correlation = .62).
These items were selected on the basis of the magnitude of item-total correlations (range of rs = .47 to .73; Björklund & Bäckström, 2008), magnitude of respective factor loadings (range = .50 to .66; Pacini & Epstein, 1999), and their perceived relevance to intuitive clinical decision making. Questionnaire instructions were slightly modified to associate each item with clinical practice activities.
Critical Thinking Questionnaire (CTQ). The CTQ was designed to evaluate practicing psychotherapists' critical thinking abilities (Gaudiano et al., 2011; Sharp et al., 2008). It consists of 28 items with multiple-choice format response scales and has a total score range of 0 to 28; total scores are calculated by summing the number of correct responses. Kuder-Richardson Formula 20 (KR20) estimates of internal consistency appear adequate (KR20 = .70; Sharp et al., 2008), although additional psychometric evaluation is needed. Of the five CTQ subscales, three were selected for the current study (viz., Inference, Deduction, and Interpretation, comprising 14 items total) in light of (a) Sharp et al.'s (2008) observation that Deduction (4 items) and Interpretation (7 items Inference questions challenge respondents to discern degrees of accuracy of inferences drawn from available data, Deduction problems test the ability to ascertain whether certain conclusions necessarily follow from given premises, and Interpretation questions require drawing accurate conclusions and generalizations upon weighing available evidence (Sharp et al., 2008). Of note, most CTQ questions originally appeared in two well-validated critical thinking measures: the Watson-Glaser Critical Thinking Assessment (Watson & Glaser, 1994) and the Cornell Critical Thinking Test (Ennis, Millman, & Tomko, 1985). Two Inference subscale items were adapted from Stanovich's (2003) textbook on critical thinking.

Procedure
Prior to participant recruitment, the initial CAKQ item pool was distributed to four nationally recognized psychologists with scholarly expertise in the delineation of scientific from pseudoscientific subject matter in clinical psychology. All four experts Most, if not all, major psychopathology ultimately has its roots in low self-esteem").
The latter change was made in light of the observation that potentially pseudoscientific assessment practices comprise a separate domain of inquiry that is equally as complex as the intervention topic, and studying clinician knowledge of such practices was deemed best reserved for a future study. Thus, the final version of the CAKQ contained 41 items.
In light of the reliance of previous research on online listservs (i.e., e-mail advertisements) and membership rosters associated with professional psychological organizations (e.g., Gaudiano et al., 2011; Sharp et al., 2008), this study attempted to reduce sampling error by recruiting participants from the northeastern U.S. irrespective of professional membership status. New England was specifically chosen for recruitment due to the known effects of physical proximity and personalization on survey return rates (i.e., the location of University of Rhode Island [URI] in a New England state and the personal relevance of practicing psychology in the Northeast; see Green & Kvidahl, 1989; Heerwegh, Vanhove, Matthijs, & Loosveldt, 2005; Sonne-Holm, Sørensen, Jensen, & Schnohr, 1989 Michalski & Kohout, 2011; Sharp et al., 2008).
An explanatory cover letter, URI Institutional Review Board (IRB)-approved informed consent form, paper questionnaire packet (see Appendix), and pre-stamped return envelope were sent to invited participants' mailing addresses via U.S. Post Office First-Class Mail. All returned packets were screened for degree of completion and adherence to inclusion criteria (e.g., checking weekly clinical hours and respondents' hand-written notes for indications of retirement). Questionnaire data were entered into an SPSS Statistics 22.0 database for statistical analyses (see Data Analysis section below for details).

Data Analysis
Data from 324 participants were used for all statistical analyses. All but five CAKQ items were included in the analyses (n = 36 items). The five excluded items comprised the attitudes toward science category, which elicited personal attitudes (versus research-corroborated knowledge) and were not included in the hypotheses of the current study. Prior to conducting the primary statistical analyses, 17 CAKQ items were recoded such that higher scores reflected greater knowledge of relevant clinical research, 4 REI items were recoded such that higher scores indicated a greater preference for intuitive decision making, and missing data frequencies were obtained for all study questionnaires (see Appendix for specific examples of recoded items). In addition, frequencies were obtained for NF responses on the CAKQ. NF responses were treated as missing data given that these responses were distinct from the CAKQ Likert scale options and thus were not accounted for by the metric of the provided response scale, and idiographic-level (i.e., participant-specific) NF rationales were not obtained in this study due to time and resource constraints.
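The recoding step described above can be illustrated with a short sketch. It is a minimal example only, assuming responses are coded 1 to 5 on the five-point scale and that "Not Familiar" answers are stored under a hypothetical "NF" marker; the function names and the item positions flagged for reverse-keying are invented for illustration and are not taken from the study's SPSS procedures.

```python
from typing import List, Optional, Sequence, Set, Union

Response = Union[int, str, None]

SCALE_MIN, SCALE_MAX = 1, 5   # assumed numeric coding of the five-point scale
NF = "NF"                     # hypothetical marker for a "Not Familiar" response


def recode(response: Response, reverse: bool) -> Optional[int]:
    """Return the item score, reverse-keyed if requested; None means missing."""
    if response is None or response == NF:
        return None           # blanks and NF responses are both treated as missing
    if not isinstance(response, int) or not SCALE_MIN <= response <= SCALE_MAX:
        raise ValueError(f"unexpected response: {response!r}")
    # Reverse-keying maps 1<->5 and 2<->4, and leaves 3 unchanged.
    return SCALE_MIN + SCALE_MAX - response if reverse else response


def recode_row(row: Sequence[Response], reverse_items: Set[int]) -> List[Optional[int]]:
    """Recode one participant's responses; reverse_items holds 0-based positions."""
    return [recode(r, i in reverse_items) for i, r in enumerate(row)]


def count_nf(row: Sequence[Response]) -> int:
    """Number of NF responses given by one participant."""
    return sum(1 for r in row if r == NF)


# Illustrative data only: positions 0 and 2 are treated as reverse-keyed items.
raw = [1, "NF", 4, None, 2]
recoded = recode_row(raw, reverse_items={0, 2})   # [5, None, 2, None, 2]
nf_total = count_nf(raw)                          # 1
```

Keeping NF counts separate from ordinary blanks, as in `count_nf`, mirrors the distinction the text draws between NF frequencies and other missing data.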
The following procedures were subsequently used to evaluate whether NF responses in the CAKQ were best characterized as missing at random (MAR) or missing completely at random (MCAR) 11 (for technical details, see Enders, 2010; Rubin, 1976). First, a multiple regression analysis (MRA) was performed to ascertain whether demographic and professional background variables significantly predicted total NF responses across participants. Second, Little's (1988) MCAR test was conducted as an omnibus test of randomness to judge whether NF responses plus the small amount of non-NF missing data could be collectively classified as MCAR (i.e., comparing expected data patterns from a random missing data process to observed missing data patterns). Deciphering the degree of randomness present in missing data has important implications for judging whether data are sufficiently random to accommodate specific remedial techniques, such as maximum likelihood estimation in the event of non-randomness/MAR, or a family of missing data imputation approaches if MCAR holds (e.g., mean substitution, regression imputation, hot or cold deck imputation, and multiple imputation) (Little & Rubin, 1987).
Next, a preliminary investigation of the dimensionality of the CAKQ was conducted using principal components analysis (PCA). Of note, the final sample size was not sufficiently large to allow a random division into equivalent subsamples (ns = 162) for cross-validation purposes (e.g., using a combinatory exploratory and confirmatory factor analytic approach with structural equation modeling would be inadequately powered and requires larger sample sizes; cf. Brown, 2006

CAKQ Response Frequencies and Characteristics
Prior to recoding of CAKQ items, the Likert-scale response frequencies for each item were computed and are presented in Table 2. In addition, an alternative scoring procedure was carried out wherein responses to CAKQ items (n = 36) were coded either "correct" or "incorrect" in accordance with contemporary clinical research findings (see Chambless & Ollendick, 2001; Faust, 2012; Lambert, 2013; Lilienfeld et al., 2003 for reviews relevant to specific item content). Specifically, if participants (a) responded with either "Strongly Disagree" or "Somewhat Disagree" to scientifically unsubstantiated statements (e.g., "Past life regression is useful for identifying clients' traumatic memories prior to their birth"); or (b) responded with either "Strongly Agree" or "Somewhat Agree" to scientifically supported statements (e.g., "Exposure plus response prevention [ERP] is an effective psychological treatment for obsessive-compulsive disorder [OCD]"), their responses were considered "correct" and coded as 1s. Responses in the incorrect direction (e.g., either strongly or somewhat agreeing or disagreeing with scientifically unsubstantiated or substantiated statements, respectively), responses of "Neither Agree Nor Disagree," NF responses, and missing data were all counted as "incorrect" and coded as 0s. Total CAKQ scores were subsequently computed by summing the number of correct responses across the 36 items for each participant. The CAKQ total score binomial distribution appeared mesokurtic (G2 = -.10) with no substantial skewness (G1 = 0.20) noted.
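The alternative scoring rule just described can be sketched as follows. This is an illustration under stated assumptions, not the scoring syntax used in the study: the response labels are those quoted above, while the "NF" marker, the function names, and the six-item answer key in the example are hypothetical.

```python
from typing import Optional, Sequence

AGREE = {"Strongly Agree", "Somewhat Agree"}
DISAGREE = {"Strongly Disagree", "Somewhat Disagree"}


def score_item(response: Optional[str], supported: bool) -> int:
    """Return 1 for a response in the research-consistent direction, else 0.

    Neutral responses, NF responses, and missing data (None) all score 0,
    matching the rule described in the text.
    """
    if response is None:
        return 0
    correct_set = AGREE if supported else DISAGREE
    return 1 if response in correct_set else 0


def total_score(responses: Sequence[Optional[str]], supported: Sequence[bool]) -> int:
    """Sum of correct responses across items for one participant."""
    if len(responses) != len(supported):
        raise ValueError("responses and answer key must have the same length")
    return sum(score_item(r, s) for r, s in zip(responses, supported))


# Hypothetical six-item key: True = scientifically supported statement.
key = [False, True, True, False, True, False]
answers = [
    "Strongly Disagree",           # unsupported item, disagreed -> 1
    "Somewhat Agree",              # supported item, agreed -> 1
    "Neither Agree Nor Disagree",  # neutral -> 0
    "NF",                          # not familiar -> 0
    None,                          # missing -> 0
    "Somewhat Agree",              # unsupported item, agreed -> 0
]
example_total = total_score(answers, key)   # 2
```

Because neutral, NF, and missing responses all fall through to 0, the sketch makes visible why this scoring scheme penalizes unfamiliarity as heavily as an incorrect answer.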
Using the alternative scoring system, the mean CAKQ total score was 17.84 (SD = 4.87) out of a total possible score of 36 (i.e., an average score of 50% correct).
The most frequently correctly answered item was: "Psychological research has established cognitive-behavioral therapy [CBT] as an efficacious intervention for social anxiety disorder" (92% correct), and the most frequently incorrectly answered counting NF responses as incorrect may have rendered this particular scoring system overly punitive, which is why alternative procedures for handling these responses were subsequently considered 12 .

Missing Data and "Not Familiar" Response Analysis
Missing data frequencies were reviewed for the abbreviated CTQ (n = 14 items), the REI (n = 10 items), and for the CAKQ items included in the statistical analyses (n = 36). Across all participants, no CTQ responses were missing, 3 REI responses (< 0.1%) were missing, and 21 CAKQ responses (0.2%) were missing.
Thus, missing responses alone were not deemed a major concern.
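The percentages above follow from dividing the number of missing responses by the total number of responses collected (participants multiplied by items). A brief sketch of that arithmetic, using the counts reported in the text (the function name is illustrative):

```python
def percent_missing(n_missing: int, n_participants: int, n_items: int) -> float:
    """Missing responses as a percentage of all participant-by-item cells."""
    return 100.0 * n_missing / (n_participants * n_items)


N = 324  # final sample size reported in the text

rei_pct = percent_missing(3, N, 10)    # 3 of 3,240 cells, just under 0.1%
cakq_pct = percent_missing(21, N, 36)  # 21 of 11,664 cells, about 0.2%
```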
NF responses were next analyzed for the CAKQ. NF responses comprised 15% of all CAKQ data collected, with 82% (n = 267) of all participants providing at least one NF response and 97% of all remaining CAKQ items (n = 35) having at least one NF response. Given this observation, a Pareto chart of the 36 CAKQ items (see Figure 1 for the bar chart and Table 2  With regard to an analysis of outliers among total NF scores, one case had a studentized residual of 3.26 (studentized deleted residual = 3.31), and seven cases evidenced Mahalanobis distances greater than 30 (D² values = 33 to 68). However, a review of these individual cases did not reveal any consistent features (e.g., relatively heterogeneous demographics, professional concentrations, years of experience, theoretical orientations, CTQ scores, and REI scores were observed). In addition, removing these cases and re-running the MRA did not yield meaningfully discrepant results.

Internal Consistency Estimates and Item-Total Correlations
Internal consistency estimates computed for the CAKQ (Likert-scale scores) and REI were adequate (Cronbach's αs = .76 and .88, respectively). Scale reliability for the CTQ, however, was poor (KR20 = .62) relative to previous estimates (e.g., Sharp et al., 2008), and correlations among subscales (i.e., Inference, Interpretation, and Deduction) were relatively small in magnitude (rs = .24 to .28). Corrected item-total correlations (i.e., with individual item values subtracted from the total score) ranged from negligible to moderate (rs = .022 to .41). For a complete listing of scale means, standard deviations, and intercorrelations, see Table 3.
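For readers less familiar with these indices, the sketch below shows one way Cronbach's α and corrected item-total correlations can be computed from a complete participants-by-items matrix. It is a minimal illustration with made-up data, not the SPSS procedure used in the study; the function names are hypothetical. When every item is scored 0 or 1, the same α formula yields KR20.

```python
from statistics import mean, variance
from typing import List, Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a participants-by-items matrix with no missing data."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    totals = [sum(row) for row in rows]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("total scores have zero variance")
    item_var_sum = sum(variance(col) for col in zip(*rows))
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def corrected_item_total(rows: Sequence[Sequence[float]]) -> List[float]:
    """Correlate each item with the total score after removing that item."""
    totals = [sum(row) for row in rows]
    result = []
    for j in range(len(rows[0])):
        item = [row[j] for row in rows]
        rest = [t - v for t, v in zip(totals, item)]
        result.append(pearson_r(item, rest))
    return result


# Made-up dichotomous data (4 participants, 3 items); alpha here equals KR20.
binary_rows = [[1, 1, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1]]
kr20 = cronbach_alpha(binary_rows)   # 0.75
```

Subtracting each item from the total before correlating, as `corrected_item_total` does, avoids inflating the correlation by including the item in its own criterion.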

Principal Components Analysis and Exploratory Factor Analysis
A PCA was performed on the CAKQ Likert-scale scores for the purposes of (a) seeking further justification for the use of a total score in subsequent analyses, and (b) preliminarily testing for the presence of the four hypothesized content categories (viz., pseudoscience, clinical judgment, evidence-based treatments, and general clinical knowledge). In light of the strength of observed CAKQ inter-scale and subscale-total score correlations (i.e., hypothesized scales moderately-to-strongly correlating with one another, rs = .20 to .60, as well as with total scores, rs = .50 to .80), an oblique (i.e., direct oblimin) rotation was selected. Scree test results indicated a possible two-component solution with observed eigenvalues as follows: 5.1, 2.4, 1.6, and 1.5 (variance explained = 14%, 7%, 5%, and 4%, respectively). However, the resulting pattern matrix revealed a largely uninterpretable solution as evidenced by inconsistent themes among within-component items (i.e., items 9, 13, 17, 20, 26, and 35 for component 1 and items 18, 20, 21, and 25 for component 2) as well as an abundance of salient (i.e., > .30) cross-loadings (range = .31 to .72). Based on these results, further examination of dimensionality was not pursued at this time, and the use of the CAKQ total score was deemed appropriate for subsequent analyses given the apparent absence of evidence for initially hypothesized CAKQ sub-domains within this sample.
An exploratory factor analysis (EFA) was next conducted on the 10-item REI for the purposes of (a) investigating the appropriateness of using subscale or total scores in subsequent analyses, and (b) corroborating the two-factor (i.e., EE and EA) solution observed in the literature (e.g., Pacini & Epstein, 1999) using principal axis factoring. Oblique rotation (viz., direct oblimin) was used given the strong observed EE-EA inter-scale correlation (r = .70, p < .001). Factor selection and acceptability were guided by the scree test, solution interpretability, and strength of parameter estimates (i.e., primary factor loadings > .30). Scree test results clearly suggested a two-factor solution. Eigenvalues for the unreduced correlation matrix were as follows: 4.8, 1.0, .77, and .71 (variance explained = 48%, 10%, 8%, and 7%, respectively). Primary factor loadings for all 10 items were well above .30 (range = .46 to .80), no salient cross-loadings emerged, and the inter-factor correlation was strong (r = .71). The pattern matrix indicated that all EE items (i.e., REI items 6-10) loaded onto the first factor, and all but one EA item (i.e., REI item 3, which also loaded onto the first factor) loaded onto the second factor (see Table 4 for item means, standard deviations, factor loadings, and communalities). In view of the high EE-EA inter-factor correlation and their strong associations with the REI total score (rs = .93 and .90, respectively), the REI total score was used in subsequent analyses.
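The link between the reported eigenvalues and the percentages of variance explained can be made explicit. For an unreduced correlation matrix of standardized items, total variance equals the number of items, so each eigenvalue divided by the item count gives the proportion of variance associated with that component or factor. The sketch below applies this to the rounded eigenvalues reported in the text; the function name is illustrative.

```python
from typing import List, Sequence


def variance_explained(eigenvalues: Sequence[float], n_items: int) -> List[float]:
    """Percentage of total variance per eigenvalue of an n_items correlation matrix."""
    return [100.0 * ev / n_items for ev in eigenvalues]


# REI: 10 items, eigenvalues as reported in the text.
rei_pct = variance_explained([4.8, 1.0, 0.77, 0.71], n_items=10)
rei_rounded = [round(p) for p in rei_pct]        # [48, 10, 8, 7]

# CAKQ: 36 items, eigenvalues as reported (rounded to one decimal in the text).
cakq_pct = variance_explained([5.1, 2.4, 1.6, 1.5], n_items=36)
```

Applying the same arithmetic to the CAKQ eigenvalues gives roughly 14%, 7%, 4.4%, and 4%. The third value falls slightly below the reported 5%, which is presumably attributable to the eigenvalue having been rounded to one decimal place.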

Multiple Regression Analyses
Two separate standard MRAs were conducted to determine how well intuitive preferences, critical thinking skills, and number of years of clinical experience predicted clinical knowledge. CAKQ Likert-scale total scores (with regression imputation applied to NF and missing data responses prior to summation) served as the outcome variable in both analyses. In the first analysis, predictor variables were REI total scores and CTQ subscale scores (i.e., Inference, Interpretation, and Deduction), which significantly predicted CAKQ scores, R² = .23 (adjusted R² = .22), F(4, 316) = 23.51, p < .001. Both REI total and the Inference subscale of the CTQ made significant contributions to the regression model, t(316) = -7.48, p < .001 (β = -.38), and t(316) = 3.01, p = .003 (β = .16), respectively. However, the Interpretation and Deduction subscales of the CTQ were not significant predictors (see Table 5 for a summary of regression results). Finally, the second MRA revealed that years of clinical experience did not significantly predict CAKQ scores, and the effect size was negligible (i.e., R² < .01).
Consistent with expectations, a lower reliance on intuitive thinking styles (as measured by the total REI score) was associated with greater clinical knowledge. This in part may reflect an incompatibility between overreliance on "gut-level" intuitive judgments and deference to and/or affinity for seeking out relevant scientific research findings. For example, more intuitively inclined practitioners may be more naturally likely to trust their initial subjective judgments about newly encountered treatments and clinical claims, which may dampen motivation to obtain updated knowledge from contemporary research. Finally, the hypothesis that total number of years of clinical experience would not predict higher clinical knowledge scores was also upheld.
This finding is consistent with previous research indicating that amount of hands-on clinical experience is all too often negligibly (if at all) related to professional competency (e.g., Goldberg, 1968; Lilienfeld et al., 2003; Vollmer et al., 2013; Weck et al., 2011).
With regard to the CAKQ NF response distribution (see Figure 1), participants generally indicated low levels of familiarity with (or perhaps were less confident in providing clear responses to) items addressing dubious clinical treatments (e.g., TFT and NLP) in contrast to high levels of familiarity with items tapping clinical judgment processes. The former observation may be partly explained by lower mainstream clinical practitioner exposure to pseudoscientific interventions than previously surmised (e.g., Olatunji et al., 2006). That is, many practitioners may be ignoring such approaches (or simply not reading about or otherwise researching them) for a variety of reasons, such as skepticism, lack of interest, lack of perceived applicability to individual professional practices, and/or feeling as if they already have a grasp of a sufficient repertoire of helpful therapeutic techniques. Another possibility is that many practitioners simply may not have been exposed to the pseudoscience literature given its relatively narrow niche in clinical psychology when compared to other widely known lines of scholarly inquiry (e.g., EBTs).
Professed familiarity with clinical judgment content (as judged by disproportionately fewer NF responses to these items relative to other items) is not surprising in light of the previously cited decision-making literature, which reveals consistent themes of practitioner overconfidence in empirically under-supported or suboptimal clinical decisional strategies. Current results suggest that self-perceived familiarity with clinical judgment content in particular is by no means a trustworthy proxy for accuracy of knowledge about clinical judgment, especially its well-researched limitations. For example, despite considerable contrary evidence (e.g., see Grove et al., 2000; Sawyer, 1966), 94% of the sample agreed with the following Because volunteers comprised the entire sample, it is important to consider the possible impact of self-selection bias on results, which would limit generalizability.
Specifically, participants who chose to complete the survey (versus non-responders) may have felt more strongly or confidently about certain topics covered by questionnaire items (e.g., stronger beliefs about or preferences for or against evidence-based interventions, intuitive clinical judgment, the perceived importance of critical thinking in clinical contexts, etc.). Psychologists with stronger scientific leanings, for instance, may have been more willing to participate in a research study. Of note, an examination of central tendency and distribution characteristics did not reveal substantially skewed results for questionnaire scores, although the REI total score distribution was slightly negatively skewed (M = 32, SD = 6.3), indicating a slightly disproportionate preference for intuitive clinical decision-making among respondents.
Returning for a moment to the observation that predominantly older psychologists in private or independent practice with 10 or more years of post-graduate clinical experience submitted completed surveys, it is possible that these individuals felt that survey content was more professionally salient or relevant for them, and/or perhaps they had more time and professional flexibility to complete the surveys compared to other psychologists (e.g., younger early-to-mid-career psychologists in other professional settings). Thus, the results of this dissertation may under-represent the belief profiles of younger (e.g., < 50 years of age) psychologists with 10 or fewer years of clinical service in non-private practice career settings. Future alternative sampling methodologies (e.g., stratified sampling or possibly cluster sampling from randomly selected hospitals, community health centers, and university clinical psychology departments) may assist with clarifying statistical relationships within a more demographically and professionally diverse group of clinical practitioners.
A noteworthy statistical concern within this study was the use of regression imputation to address NF responses (15% of all CAKQ data collected), which were treated as missing data. First, on a general note, regression imputation suffers from the shortcomings of overestimating correlations and underestimating variances (Little & Rubin, 1987), which may have biased study results. Multiple imputation (MI), however, is considered less biased than regression and mean imputation, in part due to the introduction of random noise into computation procedures (Allison, 2003).
However, MI is not without its challenges. It is a broad and highly technical approach to missing data requiring specialized software, and disagreements exist about how many imputations are necessary under which circumstances, with suggestions ranging from 2 to 510 imputations for obtaining "good" statistical results (Graham, Olchowski, Second, on a conceptual note, NF responses are inherently ambiguous in the sense that they are qualitatively different from agreeing with, disagreeing with, or taking a neutral position on a knowledge claim. The NF response option might be used, for example, by an intellectually cautious practitioner with rigorous clinical science training who may have knowledge about a particular intervention, but may not be comfortable responding in a perceived definitive manner. However, this same response may be liberally utilized by a less scientifically conscientious practitioner who fails to keep abreast of relevant clinical research literature. Thus, an optimal approach to analyzing NF responses is not readily apparent (especially without having respondents indicate their basis for selecting this response option), which is partly why they were treated as missing data for the current study.
Possible avenues for future scholarly inquiry may include questions such as the following: How might educators in the field of clinical psychology assist practitioners in developing and maintaining a professional attitude of skeptical open-mindedness toward known, typically used, and newly encountered approaches to assessment and treatment? With regard to ethical self-monitoring strategies, how might practitioners identify and attenuate personal biases and tendencies to yield to emotionally compelling and/or seemingly plausible clinical approaches in the face of conflicting evidence for their value? And more generally, what role might future psychoeducational strategies play in increasing adherence to the more effective appraisal systems science offers (e.g., deference to the current state of the evidence insofar as the peer-reviewed research literature reveals) given our well-known cognitive limitations when left to our own devices (e.g., the pervasive limitations of "gut-level" intuition when faced with complex and/or high-stakes clinical decisions affecting vulnerable clients [Dawes, 1994])?
Future research might also examine the presence and extent of clinicians' cognitive discipline during specially designed decision making tasks (e.g., degree of willingness to defer to research evidence over emotional convictions) as well as endorsed degrees of openness and imperviousness to evidence contrary to personally preferred or default modes of assessment and treatment. Given the many known human shortcomings in appraising evidence (Hastie & Dawes, 2010), reinforcement of the scientific method in clinical decision making in tandem with the capacity for informed recognition and rejection of largely ineffective clinical methodologies becomes a critical prophylactic in a field that continues to impact the lives of many individuals.
The current study has underscored the importance of continuing to study individual clinical practitioners' cognitive styles and characteristics, which appear to be underemphasized in the extant literature. As discussed earlier, a sanitized conceptual divorce of pseudoscientific content from the actions of individuals (which are in turn associated with cognitive and emotional styles, convictions, and proclivities) may not be a productive emphasis. The author respectfully disagrees, for example, with the apparently common position, which is observed even among stellar scholars in the field (e.g., McNally, 2003;Tolin, 2013, May 28), that the antidote to "bad science" is simply conducting more "good science." This move debatably courts the aforementioned compartmentalization strategy by predominantly emphasizing content dissemination. Its limitations may become more apparent by drawing the following analogy: If drug company A knowingly produces a pill repeatedly verified to be inert at best or harmful at worst (i.e., "bad pills"), all the while steadfastly maintaining that their pills work despite mutually incompatible evidence, is the appropriate sole response of the neighboring drug company B to continue mass producing pills repeatedly demonstrated to be effective (i.e., "good pills")? Although no one would disagree that good pills should continue to be improved upon and appropriately disseminated, it seems difficult to ignore the pernicious influence of drug company A, whose disseminative reach and public impact may not necessarily be checked by the circumscribed responsible actions of company B. In addition, framing this complex problem as a mere horserace between "bad science" and "good science" (or "good" or "bad" psychotherapy, assessment, pharmacotherapy, etc.) ignores the cognitive, emotional, and motivational machinery associated with the ongoing aggressive propagation of pseudoscience.
Bringing this analogy back to the present example of the ethical design, research, justification, practice, and eventual broad implementation of effective psychotherapy, it is unlikely that well-meaning, "good scientists" who conduct methodologically sound research in earnest (e.g., in Ivory Tower academia) will necessarily single-handedly deter or mitigate the influence of pseudoscience in society at large. In fact, despite the presence of excellent quality research and the availability of efficacious and effective treatments, pseudoscience has remained remarkably pervasive and popular in the public spheres (Lilienfeld, Lynn, & Lohr, 2003;Lilienfeld, Ruscio, & Lynn, 2008). It is here that the imposition of sanctions, despite the infrequency with which it is mentioned, would seem to merit careful consideration.
That is, should licensing boards and/or professional organizations such as the APA formally reprimand (or, in cases involving certain categories of legal damages, potentially withdraw licensure from) clinical practitioners who persist in using known pseudoscientific interventions demonstrated to be ineffective or harmful (e.g., see Lohr & Fowler, 2002, summer; Lowman, 2012)?
Other possible steps for combating the prevalence and influence of pseudoscience among applied psychologists include the following: (a) revamping and possibly standardizing current clinical psychology graduate-level curricula to include formal training in clinical versus actuarial judgment, cognitive biases relevant to clinical decision making, philosophy of science considerations (e.g., distinguishing scientific from unscientific approaches), and applied psychometrics in assessment; (b) developing lists of "psychotherapies to avoid" (p. 8) based on evaluative criteria consistently applied to treatment studies, which would be similar to existing lists of ESTs (Chambless & Hollon, 1998; Chambless & Ollendick, 2001); (c) organizing pseudoscience watchdog groups to respond to popular self-help material and outlandish claims in the media with the help of professional psychological organizations; and (d) setting more rigorous standards for continuing education (CE) credit courses necessary for maintaining licensure (Lilienfeld, 1998, fall; Lilienfeld, 2010; Lohr & Fowler, 2002, summer). With regard to point "a" above, some psychologists (e.g.,  have argued that rote acquisition of facts has lamentably displaced learning core principles of scientific thinking in clinical psychology programs, and the APA has been loath to address this problem for far too long. Lohr and Fowler (2002, summer) Berenbaum & Shoham, 2011; Kaslow et al., 2004), standardizing and streamlining graduate curricula in line with the scientific literature may best be combined with an emphasis on supervised demonstrations of effective practical clinical applications of knowledge gleaned from research. This applied educational strategy may assist with improving the quality of early professionals' clinical training, which in turn may help check the influence of pseudoscience.
Additionally, user-friendly treatment evaluation tools, such as the "Therapy Rating Scale" proposed by Worrall (1990) for parents and educators of children with learning disabilities, may be developed and tailored to assist clinicians with vetting novel treatments that pique their interest. Combined with the more rigorous educational training suggestions noted above, the dissemination of ready-to-use intervention evaluation tools (e.g., worksheets enumerating "full-stop" and "warning flag" criteria associated with specific treatment package features) among practitioners may help reinforce sorely needed applied critical thinking skills as well as diminish the influence of pseudoscience in applied practice.
Finally, on the front lines of actively practicing clinicians, making better use of available computer software to enhance the quality and ubiquity of helpful treatment outcome feedback is encouraged. For example, a fairly recent online psychotherapy outcome tracking program, the Systemic Therapy Inventory of Change (STIC), assesses and tracks both client progress and therapeutic alliance factors (Pinsof, Goldsmith, & Latta, 2012 (Lilienfeld, Lynn, & Lohr, 2003), we must not, as space journalist James E. Oberg once quipped, remain so open-minded that our "brains fall out" in the process (Sagan & Druyan, 1997). However, as eloquently articulated by the late Richard Feynman (1998), "…in life, in gaiety, in emotion, in human pleasures and pursuits, and in literature and so on, there is no need to be scientific" (p. 2). That is, one need not rigidly adhere to an all-pervasive, lockstep scientism across all facets of life, which would be nonsensical in many contexts (e.g., deciding which music to enjoy or which books to read for pleasure). Science can guide our efforts in constructing a laser and prolonging longevity, but as Hume's guillotine reminds us, it cannot necessarily tell us where we ought to aim the laser or

FOOTNOTES

1 The subtitle of this section is a tribute to the title of Alan F. Chalmers' (1999) classic text on the monumental efforts of eminent philosophers of science to formulate generalized definitions of science, which have often fallen short (e.g., the problems of naïve falsificationism).
2 For example, in the early 1990s, a typical graduate school applicant was four times more likely to be accepted into a clinical Psy.D. program without tuition remission (compared to a clinical Ph.D. program), and students were enrolling in practitioner-oriented programs at a rate of nearly three times higher than enrollment rates in scientist-practitioner programs (Norcross, Hanych, & Terranova, 1996).
As of 2010, the ratio of enrolled students to total number of applications submitted to APA-accredited Ph.D. programs in the U.S. was 5% compared to 18% for accredited Psy.D. programs (Kohout & Wicherski, 2010, October).
4 Identification of scientifically plausible mechanisms of change embedded in treatment packages (e.g., specific treatment components directly responsible for allaying distress) is another invaluable contribution of clinical science (Laurenceau et al., 2007). Although an inability to isolate individual change mechanisms does not preclude using interventions otherwise shown to work, as has been the case with many psychopharmacological agents, this process may ultimately help protect against the undue crediting of non-specific treatment effects in many cases (Baker et al., 2008; Beyerstein, 1997).
5 Beyerstein (1997) appeared to consider "statistical significance" sufficient in this context, but it goes without saying that other important quantitative information (e.g., effect sizes, confidence intervals, and statistical power considerations) weighs heavily on the multifaceted problem of "meaningful" differences. Moreover, a litany of both logical (e.g., invalid application of modus tollens to probabilistic conditions via modal "rejection" or "failure to reject" a null hypothesis) and statistical (e.g., instances of insufficient power to detect genuine differences or over-sensitive designs whose results are interpreted as genuine effects without regard to effect sizes) problems have plagued null hypothesis significance testing (often abbreviated NHST) since its inception, which beclouds popularly discussed notions of theory confirmation and disconfirmation (e.g., the glaring disconnect between NHST results and pinpointing specific problems with theories). It is beyond the scope of this project to discuss these complex issues in detail; the interested reader is advised to consult Harlow, Mulaik, and Steiger (1997) as well as Morrison and Henkel (2009) for a more thorough overview of the problem.
6 Pseudoscience typically has been differentiated from other questionable forms of non-science in the extant literature.
For example, "junk science" (see Edens et al., 2012; Huber, 1993) usually refers to dubious expert witness claims masquerading as informed scientific opinion in courtrooms, which is usually at odds with legal rules of evidence (e.g., Frye or Daubert standards depending on the state; Faigman & Monahan, 2009). "Quackery," on the other hand, has been defined by the U.S. Food and Drug Administration as "the commercialization of unproven, often worthless, and sometimes dangerous health products and procedures" (Young, 1988, p. 12).
7 An illustrative example of dismantling in the field of ethnobotany can be found in Wade Davis' (1985) popular book (and later, Hollywood horror film) The Serpent and the Rainbow. In this book, Davis detailed his infiltration into Haitian voodoo secret societies to research an allegedly magical and nefarious "zombi powder" used by local sorcerers to exact revenge on enemies. When ingested in food or drink, the powder supposedly caused people to die and return from the grave as zombies, to which local Haitian village habitants attested. However, upon further scrutiny, Davis (1983) discovered that the voodoo sorcerer's concoction contained a potent neurotoxin found in puffer fish called tetrodotoxin (TTX), which is known to cause a form of flaccid paralysis that gives the temporary appearance of death. Thus, although the superstitious zombification claim was debunked, Davis' (1983) component analysis (or "dismantling") style logic was noteworthy, for he demonstrated that the TTX alone (and not the human bone powder, tarantula innards, herbs, jimson weed, voodoo incantations whispered over the powder, etc.) was responsible for the observed temporary paralysis mistaken for death.
8 Some studies indicate that clinical experience sometimes may be negatively related to judgment accuracy.
For example, a study by Hermann and colleagues (1999) found that older psychiatrists with more years of clinical experience adhered to outdated knowledge about electroconvulsive therapy and were unaware of updated research on this subject.
9 Mathematician Morris Kline's (1967) apt summary statement definitely applies here, viz., "human nature is a more complicated structure than a mass sliding down an inclined plane or a bob vibrating on a spring" (p. 499). Much of psychological research relies on intangible theoretical concepts inferred through quantification procedures yielding latent constructs or latent variables (or, in statistical terms, linear composites; for detailed statistical treatments, see Bentler, 1980; Bollen, 2002; and Loehlin, 2004). Latent variables remain widely accepted in contemporary psychological research and are ubiquitous across theories and explanatory frameworks (Borsboom, Mellenbergh, & van Heerden, 2003). Reification fallacies notwithstanding, these unobservable constructs (e.g., personality features/types, cognitive abilities, and emotional states) are typically viewed as giving rise to (or, in the author's diverging opinion, are numerical Platonic summary statements of) conglomerates of observable and measurable behaviors (e.g., social proximity), test scores (e.g., IQ scores), and patterns of physical arousal (e.g., physiological indicators such as blood pressure elevation). In the context of classical test theory, the unobserved value of a latent variable under a given set of circumstances is the true score, which we attempt to mathematically approximate as best we can despite the inevitability of varying degrees of distortion attributable to measurement error (cf. DeVellis, 2003, p. 15).
Complicating matters further, when invoking a scientific realist interpretation of factor analytic results, which some researchers mistakenly do, we cannot declare with certainty that factors (or any of their constituent indicators, for that matter) share clean, isomorphic correspondences with their counterparts in reality, especially given the fragmentary, intangible, and uncontained nature of those counterparts (e.g., the diagnostic construct of depression as a finely fragmented array of various self-reported symptoms and observed behavioral signs).
10 Additional complicating factors include the difficulties often involved in reaching accurate conclusions as information at hand becomes increasingly complex and ambiguous, the salience and emotional seductiveness of privileging direct experience over more abstract research findings, and general human cognitive limitations. Given these considerations, erroneous conclusions, nonscientific leanings, and false beliefs arguably may be far more ubiquitous and influential than formally recognized by many psychologists (including Lilienfeld et al., 2003), perhaps even among those who otherwise perceive themselves as hardcore rationalists or scientists.
Roughly speaking, clinicians are not exempt from being influenced by their own evolutionarily-shaped neurobiological wiring programmed to respond to immediate or near-immediate environmental circumstances with snap judgments. This reasoning is important to consider when one is tempted to attribute the false or otherwise unorthodox beliefs of clinicians to poor moral character and/or easily remedied "irrationality," which may contribute to unjustifiably hasty and damaging characterological generalizations.
11 Note that this terminology is superficially misleading. That is, MAR does not refer to haphazardly missing data. Rather, it refers to the presence of a systematic relationship between the likelihood of missing values (on Y) and other variables (X) in a given dataset. In contrast, MCAR refers to the absence of a relationship between other variables and the probability of missing data on Y, which is closest to haphazardly missing data. Another missing data mechanism, termed missing not at random (MNAR), accounts for situations in which missing Y values are associated with Y values after controlling for the influence of other variables in a dataset (e.g., a disproportionate number of missing job performance ratings for workers who were fired for unacceptably low work performance after controlling for potential confounds, such as IQ) (Enders, 2010, pp. 5-8). Similar to MAR, there is no acknowledgement in the quantitative research literature of any formal evaluation procedure for clearly judging whether missing data are MNAR, and MNAR models (e.g., selection and pattern mixture models) rely on far more restrictive and generally untenable assumptions than MAR models (e.g., distribution assumptions that cannot be tested and controversial parameter estimation practices) (see Allison, 2002; Enders, 2010; Schafer & Graham, 2002).
12 Of note, categorical data imputation was not conducted.
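The three missing data mechanisms distinguished in footnote 11 can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions rather than anything drawn from the study's data: the function name `observed_vs_full_mean`, the missingness probabilities, and the simulated relationship between X and Y are all hypothetical choices made only for illustration.

```python
import random


def observed_vs_full_mean(mechanism, n=20000, seed=0):
    """Return (mean of observed Y, mean of all Y) under one missingness mechanism.

    All numbers here are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    observed, everything = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)              # fully observed covariate
        y = 0.5 * x + rng.gauss(0, 1)    # outcome that may go missing
        if mechanism == "MCAR":
            p_missing = 0.3                      # unrelated to X or Y
        elif mechanism == "MAR":
            p_missing = 0.6 if x < 0 else 0.1    # depends only on observed X
        elif mechanism == "MNAR":
            p_missing = 0.6 if y < 0 else 0.1    # depends on the value of Y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        everything.append(y)
        if rng.random() >= p_missing:    # keep Y only when it is not missing
            observed.append(y)
    return sum(observed) / len(observed), sum(everything) / len(everything)


for name in ("MCAR", "MAR", "MNAR"):
    obs_mean, full_mean = observed_vs_full_mean(name)
    print(f"{name}: observed mean = {obs_mean:.3f}, full mean = {full_mean:.3f}")
```

In this sketch, the observed mean of Y stays close to the full-sample mean only under MCAR. Under MAR the naive observed mean is also shifted, but the shift is explainable by X, which is what allows MAR-based procedures to correct for it; under MNAR no observed variable accounts for the shift.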
13 These particular independent variables were chosen based on perceived plausibility of accounting for variance in NF responses. For example, it is conceivable that practitioners with different theoretical orientation preferences, professional degrees/concentrations, and years of practice may have evidenced unique NF patterns.
14 Although not as quantitatively sophisticated as multiple imputation and maximum likelihood procedures, standard regression imputation is similar to these procedures in that it gleans information from non-missing data using complete-case analysis to construct regression equations used to impute missing values. It is generally considered superior to mean substitution techniques (e.g., milder covariance attenuation), although correlations and multiple R values tend to be overestimated (Enders, 2010; Olinsky, Chen, & Harlow, 2003).
15 A comparison of results from imputed and non-imputed CAKQ data was omitted because the standard default approach to missing data (viz., the complete case or listwise approach) would have yielded misleading results (i.e., the sample size would have been drastically reduced from 324 to 59 participants, resulting in a substantial loss of statistical power).
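The logic of standard regression imputation described in footnote 14 can be sketched in a few lines. This is a simplified illustration, not the procedure actually run on the CAKQ data: it assumes a single numeric predictor, whereas the study drew on several independent variables, and the function name `regression_impute` and the toy values are hypothetical.

```python
def regression_impute(x, y):
    """Fill None entries of y with predictions from a least-squares line
    fitted to the complete cases only (single-predictor simplification)."""
    complete = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    n = len(complete)
    mean_x = sum(xi for xi, _ in complete) / n
    mean_y = sum(yi for _, yi in complete) / n
    sxx = sum((xi - mean_x) ** 2 for xi, _ in complete)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in complete)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Observed values are kept; missing ones are replaced by the fitted value.
    return [yi if yi is not None else intercept + slope * xi
            for xi, yi in zip(x, y)]


years_of_practice = [1, 2, 3, 4, 5]      # hypothetical predictor values
item_score = [2, 4, None, 8, 10]         # hypothetical scores, one missing
print(regression_impute(years_of_practice, item_score))
```

Because every imputed value falls exactly on the fitted line, the filled-in cases carry no residual scatter, which is consistent with the footnote's caution that correlations and multiple R values tend to be overestimated under this approach.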
possibly mistaken) idea about exposure and response prevention (ExRP) and its effectiveness, and clinician B, who knows precisely what ExRP is (and knows that the research shows it to be efficacious and effective for OCD, especially in conjunction with Luvox), yet administers aromatherapy for OCD regardless. However, if clinician A strongly disagreed with a statement about the effectiveness of ExRP for OCD framed as "Scientific research shows…," then this response would not consist of pure belief at work; it would be undergirded by an ignorance of the relevant literature. If clinician B were to strongly disagree with the same statement, however, this would indicate a more belief-infused denial of the data as opposed to ignorance of the research.
17 A "hard-head, soft-heart" (cf. Blinder, 1987) approach to psychotherapy might draw from "science-informed humanism" (Allen, 2013, spring), which emphasizes both specific (e.g., evidence-based treatment components) and nonspecific (e.g., empathy and therapeutic alliance) factors shown to be effective in treating psychopathology (Allen, 2013, spring; Bracken et al., 2012). This idea is compatible with Peterson's (2000) concept of the scientific practitioner, although his admonition should be kept in mind: "Those who glorify the artistic aspects of clinical experience and resist scientific advances lead us astray" (p. 252).

Note. REI = Rational-Experiential Inventory. Boldface indicates highest factor loadings. REI item content can be found in the Appendix. Factor 1 = Experiential Ability; Factor 2 = Experiential Engagement; h² = communality.
Note. REI = Rational-Experiential Inventory; CTQ = Critical Thinking Questionnaire.

3. The majority of people who were sexually abused during childhood do not go on to develop severe personality disorders in adulthood. ______ (GK)
4. Empirically supported therapies rarely generalize to real-world settings. ______ (R) (EBT)
5. Psychological research has discredited the idea that human memory works like a video or tape recorder (e.g., that the brain is capable of near-perfect retention of the details of past events
41. If given access to identical information, practitioners with 10 or more years of clinical experience will tend to make considerably more accurate diagnostic judgments than practitioners with only a few years of clinical experience. ______ (R) (CJ)
Note. Letters enclosed in parentheses following each item response blank were not printed on survey copies distributed to study participants and indicate the following: CJ = clinical judgment; EBT = evidence-based treatment; GK = general knowledge; PS = pseudoscience; R = reverse-scored item.

PART C
Instructions: Please rate the following statements about your feelings, beliefs, and behaviors as they apply to your clinical practice activities using the 5-point scale below. Again, please write the appropriate number on the line located to the right of each item.
Note. Letters enclosed in parentheses following each item response blank were not printed on survey copies distributed to study participants and indicate the following: EA = Experiential Ability; EE = Experiential Engagement; R = reverse-scored item.

PART D
Instructions: Please complete the following questions as best you can by darkening the boxed letter beside the correct answer. Remember that all responses are completely confidential. Please try to complete all questions in one sitting, do not spend too much time on any one question, and do not use help from others or other sources. Further instructions are provided for some of the exercises. Please make sure that all answers are clearly marked.

Exercise 1
Imagine that disorder X occurs in one in every 1,000 people. Imagine also there is a test to diagnose the disorder that always gives a positive result when a person has the disorder. Finally, imagine that the test has a false positive rate of 5 percent. This means that the test wrongly indicates that the disorder is present in 5 percent of the cases where the person does not have the disorder.
1. Imagine that we choose a person randomly, administer the test, and that it yields a positive result (indicates that the person has the disorder). What is the probability that the individual actually has the disorder, assuming that we know nothing else about the individual's psychological or medical history?
A <10% B 10-30% C 30-50% D 50-70% E 70-90% F >90%

Exercise 2
The next exercises consist of brief paragraphs followed by several conclusions. For these questions, please assume that everything in the paragraph is true. The problem is to judge whether or not each of the proposed conclusions logically follows beyond a reasonable doubt from the information given. Please mark either follows or does not follow after the conclusion.
Chris had poor posture, had very few friends, was ill at ease around people, and in general was very unhappy. Then, a close friend recommended that Chris visit Dr. Carll, a reputed expert on helping people improve their personalities. Chris took this recommendation and, after three months of therapy with Dr. Carll, developed more friendships, was more at ease, and in general felt happier.

A Follows B Does Not Follow
When I go to bed at night, I usually fall asleep quite promptly. But about twice a month, I drink coffee during the evening, and whenever I do, I lie awake and toss for hours.
4. My problem is mostly psychological; I expect that the coffee will keep me awake, and therefore it does.
A Follows B Does Not Follow
5. On nights when I want to fall asleep promptly, I'd better not drink coffee in the evening.

A Follows B Does Not Follow
When the Journal Company, Inc. was created in 1960, it was the largest psychological journal company America had known up to that time. It produced twice as many psychological journals as all of its domestic competitors put together. Today, the Journal Company, Inc. produces about 20 percent of the psychological journals that are made in this country.
6. In 1960, the Journal Company, Inc. produced not less than 66 percent of the total domestic output of psychological journals.
A Follows B Does Not Follow
7. Today, domestic competitors produce more than three times as many psychological journals as does the Journal Company, Inc.
A Follows B Does Not Follow
8. The Journal Company, Inc. produces fewer psychological journals than it did in 1960.

Exercise 3
In this section, each exercise consists of several statements followed by several suggested conclusions. For the purposes of this study, consider the statements in each exercise as true without exception. After reading the conclusions beneath the statement, please mark whether you think it follows or does not follow from the statement given, regardless of whether you believe the statement to be true or not from your own experience or knowledge.
No person who thinks scientifically places any faith in the predictions of astrologers. Nevertheless, there are many people who rely on horoscopes provided by astrologers. Therefore:
9. People who lack confidence in horoscopes think scientifically.
A Follows B Does Not Follow
10. Many people do not think scientifically.

A Follows B Does Not Follow
Most persons who attempt to break their smoking habit find that it is something that they can accomplish only with difficulty, or cannot accomplish at all. Nevertheless, there is a growing number of individuals whose strong desire to stop smoking has enabled them to break the habit permanently. Therefore:
11. Only smokers who strongly desire to stop smoking will succeed in doing so.
A Follows B Does Not Follow
12. A strong desire to stop smoking helps some people to permanently break the habit.

A Follows B Does Not Follow
The