Some skeptics have not really read the whole literature and considered it carefully. Let's put these aside -- they are easy to detect and there is little to say about such criticism. What you are left with are two types. One is honest critics. These are very valuable and helpful people because they can light the way for additional research and development. The differences with honest critics so far seem to fall into these areas.
Breadth of the criteria. The ACT / RFT community gives more weight to a working model than to RCTs alone. Mediational analyses, RFT progress, AAQ studies, component studies, experimental psychopathology, and the like all weigh in very heavily. It is absolutely fair to let RCTs be the ultimate arbiter, but if you pick them off one at a time with methodological worries, and focus only on DSM syndromes one at a time, even at ~120 RCTs you can see less support than people within the ACT / RFT community might believe is there. Over time, however, if the ACT / RFT community does its job, even that problem will be self-correcting, because the development path being followed includes randomized controlled trials as a centrally important area -- just not the only area.
The temporal measure of progress. Given the larger purpose of ACT / RFT, this harder set of criteria needs to be considered in terms of how hard the actual task is. The ACT / RFT community wants to be held to a high (amazingly high) standard, but this also means that judgments about the accomplishment of such goals have to be made in the context of that stated purpose. This does mean that there is a certain prolonged sense of ambiguity. ACT / RFT research is more than 30 years old, and critics can still doubt whether we are actually producing a more progressive psychology. That is fair, but by the time the ACT / RFT community meets its goals to everyone's satisfaction, most folks in psychology and the behavioral sciences will know it, because these goals are so darned lofty.
Breadth of application. The ACT / RFT community thinks that the breadth of the model really matters, because the model itself claims to be about a deeper understanding of human cognition. In traditional syndromal treatment studies, the models are often quite narrow and breadth of application is not a fair test, so when these folks look at ACT / RFT they don't quite know what to say. APA says we are over the bar only in chronic pain in terms of strong empirical support. Across the board the progress is more notable, and the breadth of application is already pretty amazing, but only now are good researchers in specific areas doing a deep dive -- modifying protocols and chasing process-of-change evidence. If the program succeeds there will be multiple studies of ACT / RFT applications within specific areas. Right now there are about 10 areas with at least 5 outcome studies, and nearly 20 areas with more than two (see the new book "The ACT Research Journey" by Hooper and Larsson).
RCTs versus controlled time series designs. ACT comes from behavior analysis. If you eliminate time series designs in favor of only RCTs, the outcome data weaken, even with nearly 120 RCTs and new RCTs now appearing every 2 weeks on average.
Quality of studies. Many ACT studies are put together by students and young faculty. Quite a number are from the developing world. Only about 15 RCTs right now are funded. These early studies are often underpowered, and the methodological bells and whistles are sometimes not there. According to a careful review by A-Tjak et al. (in Psychotherapy and Psychosomatics, 2014), this is getting better, and we are starting to see replications with better controls. When you compare ACT to established CBT research from the best labs in the world, you are comparing research programs at two very different stages of development. We shall see what happens over time as funded ACT research becomes more common. However, average quality is a poor measure. A small study from, say, Iran is a wonderful thing to see ... how do any of its weaknesses pull down the more than 15 studies on ACT published in the Journal of Consulting and Clinical Psychology? You need to look to see if there are enough well done studies. You need to consider small studies with specific weaknesses in the context of the whole literature. Knocking studies over one at a time means studies with any flaws contribute nothing. That is, well, stupid. Averaging ratings means that if you get students, or people without funding, or the developing world excited, then your literature is necessarily weaker. That too is, well, stupid. Look at the whole literature and be responsible.
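To make that point concrete, here is a minimal sketch of how a standard fixed-effect, inverse-variance meta-analysis pools a literature (all effect sizes and standard errors below are hypothetical, invented purely for illustration -- they are not from any actual ACT study). Noisy, underpowered studies receive small weights; they add a little information to the pooled estimate rather than dragging down the well-powered work.

```python
# Minimal sketch (hypothetical numbers): fixed-effect, inverse-variance pooling.
# Each study contributes weight w = 1 / SE^2, so a small, noisy study adds a
# little information instead of "pulling down" a well-powered one.

studies = [
    # (label, effect size d, standard error) -- all values invented
    ("large funded RCT",       0.60, 0.10),
    ("small unfunded study A", 0.20, 0.40),
    ("small unfunded study B", 0.90, 0.45),
]

weights = [1 / se ** 2 for _, _, se in studies]
pooled = sum(w * d for (_, d, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1 / sum(weights)) ** 0.5

print(f"pooled d = {pooled:.2f} (SE = {pooled_se:.2f})")
# Weights here are roughly 100, 6, and 5: the pooled estimate stays near the
# large study's d = 0.60 no matter how noisy the two small studies are.
```

This is why "average quality" misleads as a summary of a literature: adding a weak study never subtracts evidence from the strong ones, it just contributes less.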
Published versus coming. ACT / RFT advocates often know about the data that are coming. We may know the researchers involved and feel that we can make some judgments. If you just look at publications (which an outside critic simply must do in order to be responsible), the picture looks different from the outside than it does from the inside. ACT / RFT research has been going on for 30 years, but it has only been visible for a few years, since the publication of the 1999 book on ACT and the 2001 book on RFT. About 75% of the outcome research is from the last 3 years. If the program is truly progressive, these differences will narrow over time, however. If you want the recent meta-analyses and are a member, go to the publications area. If you are not, then click here to join!
The reviews by Ost. Lars-Goran Ost has been invited to ACT conferences many times to criticize the work. Some of his criticisms have been very useful. Others come from a different research tradition and don't have much appeal (e.g., the demand to focus only on syndromes; the insistence that only syndromal measures matter in outcomes). The other problem is that his two reviews have data problems. The first one (in 2008) attributed differences in methodological quality between ACT and CBT studies to the sloppiness of ACT studies. In fact, Brandon Gaudiano showed (see Gaudiano, 2009) that if you collect data on funding more carefully, you see that all of these differences were due to funding. The ratings of methodological quality in Ost's 2014 review have not been replicated by others using the same scale (see the A-Tjak study; when Ost's ratings were compared to that study's ratings of the same studies, Ost's were more against ACT, with an unacceptably low kappa of .35). The scale itself has several problems in my humble opinion, but the A-Tjak study used a team approach for ratings that included ACT critics and ACT researchers, instead of using a single ACT critic and student raters. The 2014 review also contained over 80 factual or interpretive errors in its reading of the ACT literature, which weaken its conclusions. A response article has been submitted to BRAT.
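For readers who don't work with agreement statistics, a quick note on that kappa value (this is just the standard definition, not anything specific to either review). Cohen's kappa corrects raw inter-rater agreement for the agreement expected by chance:

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

where p_o is the observed proportion of agreement and p_e is the proportion expected by chance alone. Commonly used benchmarks (e.g., Landis and Koch, 1977) treat values in the .21-.40 range as only "fair" agreement, so a kappa of .35 means Ost's quality ratings and the A-Tjak team's ratings of the very same studies agreed only modestly better than chance.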
Bottom line. We think declaring that something is "evidence based" is a communitarian effort that should rest on open, agreed-upon standards that are carefully applied to all methods. APA has that ... and ACT is listed as evidence based in several areas. SAMHSA has that, and ACT is listed in several areas there too. In the ACBS community we prefer to learn from our critics, keep pursuing our vision, and keep trying to get better.