Abstract. One of the core issues in the efficient management of workflow applications is the prediction of task performance. This paper proposes a novel approach that enables the construction of models for predicting the running times of tasks in data-intensive scientific workflows. Ensemble machine learning techniques are used to produce robust combined models of high predictive accuracy. Information provided by workflow systems (e.g., benchmarks), together with the attributes and provenance of the data, is exploited to guarantee the accuracy of the models. The proposed approach has been tested on bioinformatics workflows for gene expression analysis in homogeneous and heterogeneous computing environments. The results highlight the advantage of ensemble models over single (standalone) prediction models. Ensemble learning techniques achieved reductions of the prediction error of up to 14.8% (homogeneous) and 8.7% (heterogeneous) in comparison with single-model strategies.