Pierre Geurts - Publications ORBI
Vandaele, R., LALLEMAND, F., MARTINIVE, P., GULYBAN, A., JODOGNE, S., COUCKE, P., Geurts, P., & Marée, R. (in press). Automated multimodal volume registration based on supervised 3D anatomical landmark detection. SCITEPRESS Digital Library.
Peer reviewed
We propose a new method for automatic 3D multimodal registration based on anatomical landmark detection. Landmark detectors are learned independently in the two imaging modalities using Extremely Randomized ...
Begon, J.-M., Joly, A., & Geurts, P. (2016, September 12). Joint learning and pruning of decision forests. Paper presented at The 25th Belgian-Dutch Conference on Machine Learning (Benelearn), Kortrijk, Belgium.
Peer reviewed
Decision forests such as Random Forests and Extremely randomized trees are state-of-the-art supervised learning methods. Unfortunately, they tend to consume much memory space. In this work, we propose an ...
Marée, R., Rollus, L., Stévens, B., Hoyoux, R., Louppe, G., Vandaele, R., Begon, J.-M., Kainz, P., Geurts, P., & Wehenkel, L. (2016, January 10). Collaborative analysis of multi-gigapixel imaging data using Cytomine. Bioinformatics, 7.
Peer reviewed (verified by ORBi)
Motivation: Collaborative analysis of massive imaging datasets is essential to enable scientific discoveries. Results: We developed Cytomine to foster active and distributed collaboration of ...
Marée, R., Geurts, P., & Wehenkel, L. (2016). Towards Generic Image Classification using Tree-based Learning: an Extensive Empirical Study. Pattern Recognition Letters.
Peer reviewed (verified by ORBi)
This paper considers the general problem of image classification without using any prior knowledge about image classes. We study variants of a method based on supervised learning whose common ...
Freres, P.*, Wenric, S.*, Boukerroucha, M., Fasquelle, C., Thiry, J., Bovy, N., Struman, I., Geurts, P., COLLIGNON, J., SCHROEDER, H., KRIDELKA, F., LIFRANGE, E., Jossa, V., Bours, V., Josse, C., & JERUSALEM, G. (2015, December 29). Circulating microRNA-based screening tool for breast cancer. Oncotarget.
Peer reviewed
Circulating microRNAs (miRNAs) are increasingly recognized as powerful biomarkers in several pathologies, including breast cancer. Here, their plasmatic levels were measured to be used as an alternative ...
* These authors have contributed equally to this work.
Du, W., Liao, Y., Tao, N., Geurts, P., Fu, X., & Leduc, G. (2015). Rating Network Paths for Locality-Aware Overlay Construction and Routing. IEEE/ACM Transactions on Networking, 23(5), 1661-1673.
Peer reviewed (verified by ORBi)
This paper investigates the rating of network paths, i.e. acquiring quantized measures of path properties such as round-trip time and available bandwidth. Compared to fine-grained measurements, coarse ...
Liegeois, R., Ziegler, E., Bahri, M. A., Phillips, C., Geurts, P., Gomez, F., Yeo, T., VANHAUDENHUYSE, A., Soddu, A., LAUREYS, S., & Sepulchre, R. (2015, July). Cerebral functional connectivity periodically (de)synchronizes with anatomical constraints. Brain Structure and Function.
Peer reviewed
This paper studies the link between resting-state functional connectivity (FC), measured by the correlations of the fMRI BOLD time courses, and structural connectivity (SC), estimated through fiber ...
Schrynemackers, M., Wehenkel, L., Madan Babu, M., & Geurts, P. (2015). Classifying pairs with trees for supervised biological network inference. Molecular Biosystems, 11(8), 2116-2125.
Peer reviewed (verified by ORBi)
Networks are ubiquitous in biology, and computational approaches have been largely investigated for their inference. In particular, supervised machine learning methods can be used to complete a partially known ...
Assent, D., Bourgot, I., Hennuy, B., Geurts, P., Noël, A., Foidart, J.-M., & Maquoi, E. (2015). A Membrane-Type-1 Matrix Metalloproteinase (MT1-MMP) - Discoidin Domain Receptor 1 Axis Regulates Collagen-Induced Apoptosis in Breast Cancer Cells. PLoS ONE, 10(3), 0116006.
Peer reviewed (verified by ORBi)
During tumour dissemination, invading breast carcinoma cells become confronted with a reactive stroma, a type I collagen-rich environment endowed with anti-proliferative and pro-apoptotic properties. To ...
Jeanray, N., Marée, R., Pruvot, B., Stern, O., Geurts, P., Wehenkel, L., & Muller, M. (2015). Phenotype Classification of Zebrafish Embryos by Supervised Learning. PLoS ONE, 10(1), 0116989, 1-20.
Peer reviewed (verified by ORBi)
Zebrafish is increasingly used to assess biological properties of chemical substances and thus is becoming a specific tool for toxicological and pharmacological studies. The effects of chemical substances on ...
Marée, R., Geurts, P., & Wehenkel, L. (2014). Towards generic image classification: an extensive empirical study (1). Eprint/Working paper retrieved from http://orbi.ulg.ac.be/handle/2268/175525.
This paper considers the general problem of image classification without using any prior knowledge about image classes. We study variants of a method based on supervised learning whose common ...
Potier, D., Davie, K., Hulselmans, G., Naval Sanchez, M., Haagen, L., Huynh-Thu, V. A., Koldere, D., Celik, A., Geurts, P., Christiaens, V., & Aerts, S. (2014). Mapping Gene Regulatory Networks in Drosophila Eye Development by Large-Scale Transcriptome Perturbations and Motif Inference. Cell Reports, 9(6), 2290-2303.
Peer reviewed (verified by ORBi)
Genome control is operated by transcription factors (TFs) controlling their target genes by binding to promoters and enhancers. Conceptually, the interactions between TFs, their binding sites, and their ...
Joly, A., Geurts, P., & Wehenkel, L. (2014). Random forests with random projections of the output space for high dimensional multi-label classification. Machine Learning and Knowledge Discovery in Databases.
Peer reviewed
We adapt the idea of random projections applied to the output space, so as to enhance tree-based ensemble methods in the context of multi-label classification. We show how learning time complexity can be ...
Sutera, A., Joly, A., François-Lavet, V., Qiu, Z., Louppe, G., Ernst, D., & Geurts, P. (2014). Simple connectome inference from partial correlation statistics in calcium imaging. Proceedings of Connectomics 2014 (ECML 2014).
Peer reviewed
In this work, we propose a simple yet effective solution to the problem of connectome inference in calcium imaging data. The proposed algorithm consists of two steps. First, processing the raw signals to ...
Botta, V., Louppe, G., Geurts, P., & Wehenkel, L. (2014, April 02). Exploiting SNP Correlations within Random Forest for Genome-Wide Association Studies. PLoS ONE.
Peer reviewed (verified by ORBi)
The primary goal of genome-wide association studies (GWAS) is to discover variants that could lead, in isolation or in combination, to a particular trait or disease. Standard approaches to GWAS, however, are ...
Azrour, S., Pierard, S., Geurts, P., & Van Droogenbroeck, M. (2014). Data normalization and supervised learning to assess the condition of patients with multiple sclerosis based on gait analysis. European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN) (pp. 649-654).
Peer reviewed
Gait impairment is considered as an important feature of disability in multiple sclerosis but its evaluation in the clinical routine remains limited. In this paper, we assess, by means of supervised learning ...
Ruyssinck, J., Huynh-Thu, V. A., Geurts, P., Dhaene, T., Demeester, P., & Saeys, Y. (2014). NIMEFI: Gene Regulatory Network Inference using Multiple Ensemble Feature Importance Algorithms. PLoS ONE, 9(3), 92709.
Peer reviewed (verified by ORBi)
One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and ...
JOSSE, C., Bouznad, N., Geurts, P., Irrthum, A., Huynh-Thu, V. A., Servais, L., Hego, A., Delvenne, P., Bours, V., & Oury, C. (2014). Identification of a microRNA landscape targeting the PI3K/Akt signaling pathway in inflammation-induced colorectal carcinogenesis. American Journal of Physiology - Gastrointestinal and Liver Physiology, 306, 229-43.
Peer reviewed (verified by ORBi)
Inflammation can contribute to tumor formation; however, markers that predict progression are still lacking. In the present study, the well-established azoxymethane (AOM)/dextran sulfate sodium (DSS)-induced ...
Marchand, G., Huynh-Thu, V. A., Kane, N. C., Arribat, S., Vares, D., Rengel, D., Balzergue, S., Rieseberg, L. H., Vincourt, P., Geurts, P., Vignes, M., & Langlade, N. B. (2014). Bridging physiological and evolutionary time-scales in a gene regulatory network. The New Phytologist, 203(2), 685-696.
Peer reviewed (verified by ORBi)
Gene regulatory networks (GRNs) govern phenotypic adaptations and reflect the trade-offs between physiological responses and evolutionary adaptation that act at different time-scales. To identify patterns of ...
Schrynemackers, M., Kuffner, R., & Geurts, P. (2013). On protocols and measures for the validation of supervised methods for the inference of biological networks. Frontiers in Genetics, 4(262).
Peer reviewed
Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort ...
Louppe, G., Wehenkel, L., Sutera, A., & Geurts, P. (2013). Understanding variable importances in forests of randomized trees. Advances in Neural Information Processing Systems 26.
Peer reviewed
Despite growing interest and practical use in various scientific areas, variable importances derived from tree-based ensemble methods are not well understood from a theoretical point of view. In this work we ...
Liao, Y., Du, W., Geurts, P., & Leduc, G. (2013). DMFSGD: A Decentralized Matrix Factorization Algorithm for Network Distance Prediction. IEEE/ACM Transactions on Networking, 21(5), 1511-1524.
Peer reviewed (verified by ORBi)
The knowledge of end-to-end network distances is essential to many Internet applications. As active probing of all pairwise distances is infeasible in large-scale networks, a natural idea is to measure a ...
Mikut, R., Dickmeis, T., Driever, W., Geurts, P., Hamprecht, F. A., Kausler, B. X., Ledesma-Carbayo, M. J., Maree, R., Mikula, K., Pantazis, P., Ronneberger, O., Santos, A., Stotzka, R., Strahle, U., & Peyrieras, N. (2013). Automated Processing of Zebrafish Imaging Data: A Survey. Zebrafish, 10(3), 401-421.
Peer reviewed
Due to the relative transparency of its embryos and larvae, the zebrafish is an ideal model organism for bioimaging approaches in vertebrates. Novel microscope technologies allow the imaging of developmental ...
Marée, R., Wehenkel, L., & Geurts, P. (2013). Extremely Randomized Trees and Random Subwindows for Image Classification, Annotation, and Retrieval. In A. Criminisi & J. Shotton (Eds.), Decision Forests in Computer Vision and Medical Image Analysis, Advances in Computer Vision and Pattern Recognition (pp. 125-142). Springer.
Peer reviewed
We present a unified framework involving the extraction of random subwindows within images and the induction of ensembles of extremely randomized trees. We discuss the specialization of this framework for ...
Huynh-Thu, V. A., Wehenkel, L., & Geurts, P. (2013). Gene regulatory network inference from systems genetics data using tree-based methods. In A. de la Fuente (Ed.), Gene Network Inference - Verification of Methods for Systems Genetics Data (pp. 63-85). Springer.
Peer reviewed
One of the pressing open problems of computational systems biology is the elucidation of the topology of gene regulatory networks (GRNs). In an attempt to solve this problem, the idea of systems genetics is to ...
Du, W., Liao, Y., Geurts, P., & Leduc, G. (2012). Ordinal Rating of Network Performance and Inference by Matrix Completion (arXiv:1211.0447).
This paper addresses the large-scale acquisition of end-to-end network performance. We made two distinct contributions: ordinal rating of network performance and inference by matrix completion. The former ...
Maes, F., Geurts, P., & Wehenkel, L. (2012). Embedding Monte Carlo search of features in tree-based ensemble methods. In P. Flach, T. De Bie, & N. Cristianini (Eds.), Machine Learning and Knowledge Discovery in Data Bases (pp. 191-206). Springer.
Peer reviewed
Feature generation is the problem of automatically constructing good features for a given target learning problem. While most feature generation algorithms belong either to the filter or to the wrapper ...
Schnitzler, F., Geurts, P., & Wehenkel, L. (2012). Mixtures of Bagged Markov Tree Ensembles. In A. Cano Utrera, M. Gómez-Olmedo, & T. Nielsen (Eds.), Proceedings of the 6th European Workshop on Probabilistic Graphical Models (pp. 283-290).
Peer reviewed
Markov trees, a probabilistic graphical model for density estimation, can be expanded in the form of a weighted average of Markov Trees. Learning these mixtures or ensembles from observations can be performed ...
Hiard, S., Geurts, P., & Wehenkel, L. (2012). Comparator selection for RPC with many labels. ECAI 2012: 20th European Conference on Artificial Intelligence: 27-31 August 2012, Montpellier, France (pp. 408-413). Amsterdam, Netherlands: IOS Press.
Peer reviewed
The Ranking by Pairwise Comparison algorithm (RPC) is a well established label ranking method. However, its complexity is of O(N²) in the number N of labels. We present algorithms for selection, before model ...
Joly, A., Schnitzler, F., Geurts, P., & Wehenkel, L. (2012). L1-based compression of random forest models. Proceedings of the 21st Belgian-Dutch Conference on Machine Learning.
Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive ...
Schnitzler, F., Ammar, S., Leray, P., Geurts, P., & Wehenkel, L. (2012). Approximation efficace de mélanges bootstrap d’arbres de Markov pour l’estimation de densité. In L. Bougrain (Ed.), Actes de la 14e Conférence Francophone sur l'Apprentissage Automatique (CAp 2012) (pp. 207-222).
Peer reviewed
We consider algorithms for learning bootstrap mixtures of Markov trees for density estimation. For problems with a large number of variables and few observations, these ...
Huynh-Thu, V. A., Saeys, Y., Wehenkel, L., & Geurts, P. (2012). Statistical interpretation of machine learning-based feature importance scores for biomarker discovery. Bioinformatics, 28(13), 1766-1774.
Peer reviewed (verified by ORBi)
Motivation: Univariate statistical tests are widely used for biomarker discovery in bioinformatics. These procedures are simple, fast and their output is easily interpretable by biologists but they can only ...
Joly, A., Schnitzler, F., Geurts, P., & Wehenkel, L. (2012). L1-based compression of random forest models. 20th European Symposium on Artificial Neural Networks.
Peer reviewed
Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive ...
Liao, Y., Du, W., Geurts, P., & Leduc, G. (2012). DMFSGD: A Decentralized Matrix Factorization Algorithm for Network Distance Prediction (arXiv:1201.1174).
The knowledge of end-to-end network distances is essential to many Internet applications. As active probing of all pairwise distances is infeasible in large-scale networks, a natural idea is to measure a few ...
Louppe, G., & Geurts, P. (2012). Ensembles on Random Patches. Machine Learning and Knowledge Discovery in Databases. Berlin: Springer-Verlag.
Peer reviewed
In this paper, we consider supervised learning under the assumption that the available memory is small compared to the dataset size. This general framework is relevant in the context of big data, distributed ...
Liao, Y., Du, W., Geurts, P., & Leduc, G. (2011). Decentralized Prediction of End-to-End Network Performance Classes. Proc. of the 7th International Conference on emerging Networking EXperiments and Technologies (CoNEXT). ACM.
Peer reviewed
In large-scale networks, full-mesh active probing of end-to-end performance metrics is infeasible. Measuring a small set of pairs and predicting the others is more scalable. Under this framework, we formulate ...
Joly, A., Schnitzler, F., Geurts, P., & Wehenkel, L. (2011, November 29). Pruning randomized trees with L1-norm regularization. Poster session presented at DYSCO Study Day, Leuven-Heverlee, Belgium.
The growing amount of high-dimensional data requires robust analysis techniques. Tree-based ensemble methods provide such accurate supervised learning models. However, the model complexity can become utterly huge ...
Levels, J. H., Geurts, P., Karlsson, H., Marée, R., Ljunggren, S., Fornander, L., Wehenkel, L., Lindahl, M., Stroes, E. S., Kuivenhoven, J. A., & Meijers, J. C. (2011). High-density lipoprotein proteome dynamics in human endotoxemia. Proteome Science, 9(1), 34.
Peer reviewed (verified by ORBi)
BACKGROUND: A large variety of proteins involved in inflammation, coagulation, lipid-oxidation and lipid metabolism have been associated with high-density lipoprotein (HDL) and it is anticipated that changes ...
Stern, O., Marée, R., Aceto, J., Jeanray, N., Muller, M., Wehenkel, L., & Geurts, P. (2011, May 20). Zebrafish Skeleton Measurements using Image Analysis and Machine Learning Methods. Poster session presented at Belgian Dutch Conference on Machine learning (Benelearn).
Peer reviewed
The zebrafish is a model organism for biological studies on development and gene function. Our work aims at automating the detection of the cartilage skeleton and measuring several distances and angles to ...
Geurts, P. (2011). Learning from positive and unlabeled examples by enforcing statistical significance. JMLR: Workshop and Conference Proceedings, 15.
Peer reviewed
Given a finite but large set of objects described by a vector of features, only a small subset of which have been labeled as ‘positive’ with respect to a class of interest, we consider the problem of ...
Schnitzler, F., Geurts, P., & Wehenkel, L. (2011, March 21). Looking for applications of mixtures of Markov trees in bioinformatics. Paper presented at BioMAGNet Annual Meeting 2011, Bruxelles, Belgium.
Probabilistic graphical models (PGM) efficiently encode a probability distribution on a large set of variables. While they have already had several successful applications in biology, their poor scaling in ...
Geurts, P., & Louppe, G. (2011). Learning to rank with extremely randomized trees. JMLR: Workshop and Conference Proceedings, 14, 49-61.
Peer reviewed
In this paper, we report on our experiments on the Yahoo! Labs Learning to Rank challenge organized in the context of the 23rd International Conference of Machine Learning (ICML 2010). We competed in both the ...
Stern, O., Marée, R., Aceto, J., Jeanray, N., Muller, M., Wehenkel, L., & Geurts, P. (2011). Automatic localization of interest points in zebrafish images with tree-based methods. Proceedings of the 6th IAPR International Conference on Pattern Recognition in Bioinformatics. Springer.
Peer reviewed
In many biological studies, scientists assess effects of experimental conditions by visual inspection of microscopy images. They are able to observe whether a protein is expressed or not, if cells are going ...
Louppe, G., & Geurts, P. (2010, December 11). A zealous parallel gradient descent algorithm. Poster session presented at NIPS 2010 Workshop on Learning on Cores, Clusters and Clouds, Whistler, Canada.
Peer reviewed
Parallel and distributed algorithms have become a necessity in modern machine learning tasks. In this work, we focus on parallel asynchronous gradient descent and propose a zealous variant that minimizes the ...
Huynh-Thu, V. A., Irrthum, A., Wehenkel, L., & Geurts, P. (2010). Inferring Regulatory Networks from Expression Data Using Tree-Based Methods. PLoS ONE, 5(9), 12776.
Peer reviewed (verified by ORBi)
One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high throughput genomic data, in particular microarray gene ...
Liao, Y., Geurts, P., & Leduc, G. (2010). Network Distance Prediction Based on Decentralized Matrix Factorization. Lecture Notes in Computer Science, 6091, 15-26.
Peer reviewed
Network Coordinate Systems (NCS) are promising techniques to predict unknown network distances from a limited number of measurements. Most NCS algorithms are based on metric space embedding and suffer from ...
Marée, R., Denis, P., Wehenkel, L., & Geurts, P. (2010). Incremental Indexing and Distributed Image Search using Shared Randomized Vocabularies. ACM Proceedings MIR 2010.
Peer reviewed
We present a cooperative framework for content-based image retrieval for the realistic setting where images are distributed across multiple cooperating servers. The proposed method is in line ...
El Khayat, I., Geurts, P., & Leduc, G. (2010). Enhancement of TCP over wired/wireless networks with packet loss classifiers inferred by supervised learning. Wireless Networks, 16(2), 273-290.
Peer reviewed (verified by ORBi)
TCP is suboptimal in heterogeneous wired/wireless networks because it reacts in the same way to losses due to congestion and losses due to link errors. In this paper, we propose to improve TCP performance in ...
De Lobel, L., Geurts, P., Baele, G., Castro-Giner, F., Kogevinas, M., & Van Steen, K. (2010). A screening methodology based on Random Forests to improve the detection of gene-gene interactions. European Journal of Human Genetics, 18(1127), 1132.
Peer reviewed (verified by ORBi)
The search for susceptibility loci in gene-gene interactions imposes a methodological and computational challenge for statisticians because of the large dimensionality inherent to the modelling of gene-gene ...
Marée, R., Stern, O., & Geurts, P. (2010). Biomedical Imaging Modality Classification Using Bags of Visual and Textual Terms with Extremely Randomized Trees: Report of ImageCLEF 2010 Experiments. CLEF Notebook Papers/LABs/Workshops.
In this paper we describe our experiments related to the ImageCLEF 2010 medical modality classification task using extremely randomized trees. Our best run combines bags of textual and visual features ...
Cornélusse, B., Geurts, P., & Wehenkel, L. (2009, December 12). Tree based ensemble models regularization by convex optimization. Paper presented at NIPS-09 workshop on Optimization for Machine Learning, Whistler, Canada.
Peer reviewed
Tree based ensemble methods can be seen as a way to learn a kernel from a sample of input-output pairs. This paper proposes a regularization framework to incorporate non-standard information not used in the ...
Geurts, P., Irrthum, A., & Wehenkel, L. (2009). Supervised learning with decision tree-based methods in computational and systems biology. Molecular Biosystems, 5(12), 1593-1605.
Peer reviewed (verified by ORBi)
At the intersection between artificial intelligence and statistics, supervised learning provides algorithms to automatically build predictive models only from observations of a system. During the last ...
Liao, Y., Kaafar, M. A., Gueye, B., Cantin, F., Geurts, P., & Leduc, G. (2009). Detecting Triangle Inequality Violations in Internet Coordinate Systems by Supervised Learning. Lecture Notes in Computer Science, 5550, 352-363.
Peer reviewed
Internet Coordinate Systems (ICS) are used to predict Internet distances with limited measurements. However, the precision of an ICS is degraded by the presence of Triangle Inequality Violations (TIVs). Simple ...
Dumont, M., Marée, R., Wehenkel, L., & Geurts, P. (2009). Fast Multi-Class Image Annotation with Random Subwindows and Multiple Output Randomized Trees. Proc. International Conference on Computer Vision Theory and Applications (VISAPP) (pp. 196-203).
Peer reviewed
This paper addresses image annotation, i.e. labelling pixels of an image with a class among a finite set of predefined classes. We propose a new method which extracts a sample of subwindows from a set of ...
Marée, R., Geurts, P., & Wehenkel, L. (2009). Content-based Image Retrieval by Indexing Random Subwindows with Randomized Trees. IPSJ Transactions on Computer Vision and Applications, 1.
Peer reviewed
We propose a new method for content-based image retrieval which exploits the similarity measure and indexing structure of totally randomized tree ensembles induced from a set of subwindows randomly extracted ...
De Seny, D., Ribbens, C., Cobraiville, G., Meuwis, M.-A., Marée, R., Geurts, P., Wehenkel, L., Louis, E., Merville, M.-P., Fillet, M., & Malaise, M. (2009). Protéomique par SELDI-TOF-MS des maladies inflammatoires articulaires: identification des protéines S100 comme protéines d'intérêt. Revue Médicale de Liège, 64(Spec No), 29-35.
Peer reviewed (verified by ORBi)
Clinical proteomics is a technical approach studying the entire proteome expressed by cells, tissues or organs. It describes the dynamics of cell regulation by detecting molecular events related ...
Marée, R., Stevens, B., Geurts, P., Guern, Y., & Mack, P. (2009). A Machine Learning Approach for Material Detection in Hyperspectral Images. Proc. 6th IEEE Workshop on Object Tracking and Classification Beyond and in the Visible Spectrum (OTCBVS-CVPR09). IEEE.
Peer reviewed
In this paper we propose a machine learning approach for the detection of gaseous traces in thermal infrared hyperspectral images. It exploits both spectral and spatial information by extracting subcubes ...
Botta, V., Hansoul, S., Geurts, P., & Wehenkel, L. (2008). Raw genotypes vs haplotype blocks for genome wide association studies by random forests. Proc. of MLSB 2008, second workshop on Machine Learning in Systems Biology.
Peer reviewed
We consider two different representations of the input data for genome-wide association studies using random forests, namely raw genotypes described by a few thousand to a few hundred thousand discrete ...
Meuwis, M.-A., Fillet, M., Lutteri, L., Marée, R., Geurts, P., De Seny, D., Malaise, M., Chapelle, J.-P., Wehenkel, L., Belaiche, J., Merville, M.-P., & Louis, E. (2008). Proteomics for prediction and characterization of response to infliximab in Crohn's disease: a pilot study. Clinical Biochemistry, 41(12), 960-7.
Peer reviewed (verified by ORBi)
OBJECTIVES: Infliximab is the first anti-TNFalpha accepted by the Food and Drug Administration for use in inflammatory bowel disease treatment. Few clinical, biological and genetic factors tend to predict ...
Del Angel, A., Geurts, P., Ernst, D., Glavic, M., & Wehenkel, L. (2007). Estimation of rotor angles of synchronous machines using artificial neural networks and local PMU-based quantities. Neurocomputing, 70(16-18), 2668-2678.
Peer reviewed (verified by ORBi)
This paper investigates a possibility for estimating rotor angles in the time frame of transient (angle) stability of electric power systems, for use in real-time. The proposed dynamic state estimation ...
Huynh-Thu, V. A., Hiard, S., Geurts, P., Muller, M., Struman, I., Martial, J., & Wehenkel, L. (2007, September). Detection of micro-RNA/gene interactions involved in angiogenesis using machine learning techniques. Poster session presented at Workshop on Machine Learning in Systems Biology (MLSB07), Evry, France.
Motivation: Angiogenesis is the process responsible for the growth of new blood vessels from existing ones. It is also associated with the development of cancer, as tumors need to be irrigated by blood vessels ...
El Khayat, I., Geurts, P., & Leduc, G. (2007). Machine-learnt versus analytical models of TCP throughput. Computer Networks, 51(10), 2631-2644.
Peer reviewed (verified by ORBi)
We first study the accuracy of two well-known analytical models of the average throughput of long-term TCP flows, namely the so-called SQRT and PFTK models, and show that these models are far from being ...
Geurts, P., Touleimat, N., Dutreix, M., & d'Alché-Buc, F. (2007). Inferring biological networks with output kernel trees. BMC Bioinformatics, 8(Suppl. 2), 4.
Peer reviewed (verified by ORBi)
Background: Elucidating biological networks between proteins appears nowadays as one of the most important challenges in systems biology. Computational approaches to this problem are important to complement ...
Marée, R., Geurts, P., & Wehenkel, L. (2007). Content-based Image Retrieval by Indexing Random Subwindows with Randomized Trees. Proc. 8th Asian Conference on Computer Vision (ACCV), LNCS (pp. 611-620). Springer-Verlag.
Peer reviewed
We propose a new method for content-based image retrieval which exploits the similarity measure and indexing structure of totally randomized tree ensembles induced from a set of subwindows randomly ...
Marée, R., Geurts, P., & Wehenkel, L. (2007). Random subwindows and extremely randomized trees for image classification in cell biology. BMC Cell Biology, 8(Suppl. 1).
Peer reviewed (verified by ORBi)
Background: With the improvements in biosensors and high-throughput image acquisition technologies, life science laboratories are able to perform an increasing number of experiments that involve the generation ...
Meuwis, M.-A., Fillet, M., Geurts, P., De Seny, D., Lutteri, L., Chapelle, J.-P., Bours, V., Wehenkel, L., Belaiche, J., Malaise, M., Louis, E., & Merville, M.-P. (2007). Biomarker discovery for inflammatory bowel disease, using proteomic serum profiling. Biochemical Pharmacology, 73(9), 1422-1433.
Peer reviewed (verified by ORBi)
Crohn's disease and ulcerative colitis, known as inflammatory bowel diseases (IBD), are chronic immuno-inflammatory pathologies of the gastrointestinal tract. These diseases are multifactorial, polygenic and of ...
El Khayat, I., Geurts, P., & Leduc, G. (2006). On the accuracy of analytical models of TCP throughput. Lecture Notes in Computer Science, 3976, 488-500.
Peer reviewed
Based on a large set of TCP sessions we first study the accuracy of two well-known analytical models (SQRT and PFTK) of the TCP average rate. This study shows that these models are far from being accurate on ...
Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine Learning, 63(1), 3-42.
Peer reviewed (verified by ORBi)
This paper proposes a new tree-based ensemble method for supervised classification and regression problems. It essentially consists of randomizing strongly both attribute and cut-point choice while splitting a ...
Geurts, P., Marée, R., & Wehenkel, L. (2006). Segment and combine: a generic approach for supervised learning of invariant classifiers from topologically structured data. Proceedings of the Machine Learning Conference of Belgium and The Netherlands (Benelearn) (pp. 15-23).
Peer reviewed
A generic method for supervised classification of structured objects is presented. The approach induces a classifier by (i) deriving a surrogate dataset from a pre-classified dataset of structured objects ...
Geurts, P., Wehenkel, L., & d'Alché-Buc, F. (2006). Kernelizing the output of tree-based methods. Proceedings of the 23rd International Conference on Machine Learning (pp. 345-352). ACM.
Peer reviewed
We extend tree-based methods to the prediction of structured outputs using a kernelization of the algorithm that allows one to grow trees as soon as a kernel can be defined on the output space. The resulting ...
Marée, R., Geurts, P., & Wehenkel, L. (2006). Biological Image Classification with Random Subwindows and Extra-Trees. Paper presented at Bio-Image Informatics (Workshop on Multiscale Biological Imaging, Data Mining & Informatics), Santa Barbara, USA.
Peer reviewed
We illustrate the potential of our image classification method on three datasets of images at different imaging modalities/scales, from subcellular locations up to human body regions. The method is based on ...
Quach, M., Geurts, P., & d'Alché-Buc, F. (2006). Elucidating the structure of genetic regulatory networks: a study of a second order dynamical model on artificial data. Proc. of the 14th European Symposium on Artificial Neural Networks.
Peer reviewed
Learning regulatory networks from time-series of gene expression is a challenging task. We propose to use synthetic data to analyze the ability of a state-space model to retrieve the network structure ...
Wehenkel, L., Ernst, D., & Geurts, P. (2006). Ensembles of extremely randomized trees and some generic applications. Proceedings of Robust Methods for Power System State Estimation and Load Forecasting.
Peer reviewed
In this paper we present a new tree-based ensemble method called “Extra-Trees”. This algorithm averages predictions of trees obtained by partitioning the input space with randomly generated splits, leading to ...
Wehenkel, L., Glavic, M., Geurts, P., & Ernst, D. (2006). About automatic learning for advanced sensing, monitoring and control of electric power systems. Proceedings of the Second Carnegie Mellon Conference in Electric Power Systems: Monitoring, Sensing, Software and its Valuation for the Changing electric Power Industry.
Peer reviewed
The paper considers the possible uses of automatic learning for improving power system performance by software methodologies. Automatic learning per se is first reviewed and recent developments of the field ...
Wehenkel, L., Glavic, M., Geurts, P., & Ernst, D. (2006). Automatic learning of sequential decision strategies for dynamic security assessment and control. Proceedings of the IEEE Power Engineering Society General Meeting 2006.
Peer reviewed
This paper proposes to formulate security control as a sequential decision making problem and presents new developments in automatic learning of sequential decision making strategies from simulations and/or ...
El Khayat, I., Geurts, P., & Leduc, G. (2005). Improving TCP in wireless networks with an adaptive machine-learnt classifier of packet loss causes. Lecture Notes in Computer Science, 3462, 549-560.
Peer reviewed
TCP understands all packet losses as buffer overflows and reacts to such congestions by reducing its rate. In hybrid wired/wireless networks where a non-negligible number of packet losses are due to link ...
Ernst, D., Geurts, P., & Wehenkel, L. (2005). Tree-based batch mode reinforcement learning. Journal of Machine Learning Research, 6, 503-556.
Peer reviewed (verified by ORBi)
Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the so-called Q ...
De Seny, D., Fillet, M., Meuwis, M.-A., Geurts, P., Lutteri, L., Ribbens, C., Bours, V., Wehenkel, L., Piette, J., Malaise, M., & Merville, M.-P. (2005). Discovery of new rheumatoid arthritis biomarkers using the surface-enhanced laser desorption/ionization time-of-flight mass spectrometry ProteinChip approach. Arthritis and Rheumatism, 52(12), 3801-12.
Peer reviewed (verified by ORBi)
OBJECTIVE: To identify serum protein biomarkers specific for rheumatoid arthritis (RA), using surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) technology. METHODS: A ...
Ernst, D., Glavic, M., Geurts, P., & Wehenkel, L. (2005). Approximate value iteration in the reinforcement learning context. Application to electrical power system control. International Journal of Emerging Electrical Power Systems, 3(1).
Peer reviewed (verified by ORBi)
In this paper we explain how to design intelligent agents able to process the information acquired from interaction with a system to learn a good control policy and show how the methodology can be applied to ...
Geurts, P. (2005). Bias vs. variance decomposition for regression and classification. In O. Maimon & L. Rokach (Eds.), Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers. Kluwer Academic Publishers.
In this chapter, the important concepts of bias and variance are introduced. After an intuitive introduction to the bias/variance tradeoff, we discuss the bias/variance decompositions of the mean square error ...
Geurts, P., Blanco Cuesta, A., & Wehenkel, L. (2005). Segment and combine approach for Biological Sequence Classification. Proc. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2005) (pp. 194-201).
Peer reviewed
This paper presents a new algorithm based on the segment and combine paradigm, for automatic classification of biological sequences. It classifies sequences by aggregating the information about their ...
Geurts, P., Fillet, M., De Seny, D., Meuwis, M.-A., Malaise, M., Merville, M.-P., & Wehenkel, L. (2005). Proteomic mass spectra classification using decision tree based ensemble methods. Bioinformatics, 21(14), 3138-45.
Peer reviewed (verified by ORBi)
MOTIVATION: Modern mass spectrometry allows the determination of proteomic fingerprints of body fluids like serum, saliva or urine. These measurements can be used in many medical applications in order to ...
Geurts, P., & Wehenkel, L. (2005). Closed-form dual perturb and combine for tree-based models. Proceedings of the International Conference on Machine Learning (ICML 2005).
Peer reviewed
This paper studies the aggregation of predictions made by tree-based models for several perturbed versions of the attribute vector of a test case. A closed-form approximation of this scheme combined with cross ...
Geurts, P., & Wehenkel, L. (2005). Segment and combine approach for non-parametric time-series classification. Lecture Notes in Computer Science, 3721, 478-485.
Peer reviewed
This paper presents a novel, generic, scalable, autonomous, and flexible supervised learning algorithm for the classification of multivariate and variable length time series. The essential ingredients of the ...
Marée, R., Geurts, P., Piater, J., & Wehenkel, L. (2005). Biomedical image classification with random subwindows and decision trees. Computer Vision for Biomedical Image Applications (pp. 220-229). Berlin: Springer-Verlag Berlin.
Peer reviewed
In this paper, we address a problem of biomedical image classification that involves the automatic classification of x-ray images in 57 predefined classes with large intra-class variability. To achieve that ...
Marée, R., Geurts, P., Piater, J., & Wehenkel, L. (2005). Decision Trees and Random Subwindows for Object Recognition. ICML workshop on Machine Learning Techniques for Processing Multimedia Content (MLMM2005).
Peer reviewed
In this paper, we compare five tree-based machine learning methods within a recent generic image classification framework based on random extraction and classification of subwindows. We evaluate them on three ...
Marée, R., Geurts, P., Piater, J., Wehenkel, L., Schmid, C. (Ed.), Soatto, S. (Ed.), & Tomasi, C. (Ed.). (2005). Random Subwindows for Robust Image Classification. Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2005) (pp. 34-40).
Peer reviewed
We present a novel, generic image classification method based on a recent machine learning algorithm (ensembles of extremely randomized decision trees). Images are classified using randomly extracted ...
Geurts, P., El Khayat, I., & Leduc, G. (2004, November). A Machine Learning Approach to Improve Congestion Control over Wireless Computer Networks. USA: IEEE.
Peer reviewed
In this paper, we present the application of machine learning techniques to the improvement of the congestion control of TCP in wired/wireless networks. TCP is suboptimal in hybrid wired/wireless networks ...
Marée, R., Geurts, P., Piater, J., Wehenkel, L., Hong, K.-S. (Ed.), & Zhang, Z. (Ed.). (2004). A generic approach for image classification based on decision tree ensembles and local sub-windows. Proceedings of the 6th Asian Conference on Computer Vision (pp. 860-865). Asian Federation of Computer Vision Societies (AFCV).
Peer reviewed
A novel and generic approach for image classification is presented. The method operates directly on pixel values and does not require feature extraction. It combines a simple local sub-window extraction ...
Ernst, D., Geurts, P., & Wehenkel, L. (2003). Iteratively extending time horizon reinforcement learning. Machine Learning: ECML 2003, 14th European Conference on Machine Learning (pp. 96-107). Berlin: Springer-Verlag Berlin.
Peer reviewed
Reinforcement learning aims to determine an (infinite time horizon) optimal control policy from interaction with a system. It can be solved by approximating the so-called Q-function from a sample of four ...
Geurts, P. (2003). Traitement de données volumineuses par ensemble d'arbres aléatoires. Revue des nouvelles technologies de l'information, Numéro spécial entreposage et fouille de données, 1, 111-122.
Peer reviewed
This article presents a new learning method based on an ensemble of decision trees. In contrast to the traditional induction method, the trees of the ensemble are built ...
Marée, R., Geurts, P., & Wehenkel, L. (2003). Une méthode générique pour la classification automatique d'images à partir des pixels. Revue des Nouvelles Technologies de l'Information, 1, 227-238.
Peer reviewed
In this article, we evaluate a generic approach to automatic image classification. It relies on a recent learning method that builds ensembles of decision trees by selection ...
Geurts, P. (2002). Contributions to decision tree induction: bias/variance tradeoff and time series classification. Unpublished doctoral thesis, University of Liège, Belgium.
Because of the rapid progress of computer and information technology, large amounts of data are nowadays available in a lot of domains. Automatic learning aims at developing algorithms able to produce ...
Geurts, P. (2001). Dual Perturb and Combine Algorithm. Proceedings of AISTATS 2001, Eighth International Workshop on Artificial Intelligence and Statistics (pp. 196-201). Key-West, Florida.
Peer reviewed
In this paper, a dual perturb and combine algorithm is proposed which consists in producing the perturbed predictions at the prediction stage using only one model. To this end, the attribute vector of a test ...
Geurts, P. (2001). Pattern extraction for time-series classification. Proceedings of PKDD 2001, 5th European Conference on Principles of Data Mining and Knowledge Discovery (pp. 115-127). Freiburg: Springer-Verlag.
Peer reviewed
In this paper, we propose some new tools to allow machine learning classifiers to cope with time series data. We first argue that many time-series classification problems can be solved by detecting and ...
Geurts, P., Olaru, C., & Wehenkel, L. (2001). Improving the bias/variance tradeoff of decision trees - towards soft tree induction. Engineering intelligent systems, 9, 195-204.
Peer reviewed
One of the main difficulties with standard top down induction of decision trees comes from the high variance of these methods. High variance means that, for a given problem and sample size, the resulting tree ...
Geurts, P. (2000). Some enhancements of decision tree bagging. Proceedings of PKDD 2000, 4th European Conference on Principles of Data Mining and Knowledge Discovery (pp. 136-147). Lyon, France: Springer-Verlag.
Peer reviewed
This paper investigates enhancements of decision tree bagging which mainly aim at improving computation times, but also accuracy. The three questions which are reconsidered are: discretization of continuous ...
Geurts, P., & Wehenkel, L. (2000). Investigation and reduction of discretization Variance in decision tree induction. Proceedings of ECML 2000, European Conference on Machine Learning (pp. 162-170). Springer-Verlag.
Peer reviewed
This paper focuses on the variance introduced by the discretization techniques used to handle continuous attributes in decision tree induction. Different discretization procedures are first studied empirically ...
Geurts, P., & Wehenkel, L. (2000). Temporal machine learning for switching control. Proceedings of PKDD 2000, 4th European Conference on Principles of Data Mining and Knowledge Discovery (pp. 401-408). Lyon, France: Springer-Verlag.
Peer reviewed
In this paper, a temporal machine learning method is presented which is able to automatically construct rules that allow an event to be detected as soon as possible using past and present measurements made on a ...
Olaru, C., Geurts, P., & Wehenkel, L. (1999). Data mining tools and application in power system engineering. Proceedings of the 13th Power System Computation Conference, PSCC99 (pp. 324-330). Trondheim, Norway.
Peer reviewed
The power system field is presently facing an explosive growth of data. The data mining (DM) approach provides tools for making explicit some implicit subtle structure in data. Applying data mining to power ...
Geurts, P., & Wehenkel, L. (1998). Early prediction of electric power system blackouts by temporal machine learning. Proceedings of ICML-AAAI 98 Workshop on "Predicting the future: AI approaches to time series analysis" (pp. 21-27). Madison (Wisconsin).
Peer reviewed
This paper discusses the application of machine learning to the design of power system blackout prediction criteria, using a large database of random power system scenarios generated by Monte-Carlo simulation ...