Projects

Multi-omic data integration

Integrate multi-omic cancer data representing gene expression, methylation, copy number alterations, somatic mutations, and microRNA expression

RNA-seq co-expression

Identify clusters of co-expressed genes from RNA-seq data

Network inference

Identify gene regulatory networks from transcriptomic data

Recent & Upcoming Talks

Exploring drivers of gene expression in The Cancer Genome Atlas
Dec 4, 2018 3:30 PM
Co-expression analyses of RNA-seq data in practice with the R/Bioconductor package coseq
Jun 22, 2018 4:15 PM
Exploring drivers of gene expression in The Cancer Genome Atlas
Mar 28, 2018 8:30 AM

Publications

Book

  • Albert, I., Ancelet, S., David, O., Denis, J.-B., Makowski, D., Parent, É., Rau, A., and Soubeyrand, S. (2015). Initiation à la statistique bayésienne : Bases théoriques et applications en alimentation, environnement, épidémiologie et génétique. Éditions Ellipses, collection références sciences. Code Publisher link ProdInra

Dissertations

  • Rau, A. (2017) Statistical methods and software for the analysis of transcriptomic data. HDR (Habilitation à diriger des recherches) thesis, Université d’Évry Val-d’Essonne. PDF Slides ProdInra
    Note: HDR is a high-level (post-PhD) degree granted by French universities that provides an accreditation to supervise research.

  • Rau, A. (2010) Reverse engineering gene networks using genomic time-course data. PhD thesis, Purdue University. PDF ProdInra

Journal Articles

2018

  • Plasterer, C., Tsaih, S.-W., Lemke, A., Schilling, R., Dwinell, M., Rau, A., Auer, P., Rui, H., Flister, M.J. (2018) Identification of a rat mammary tumor risk locus that is syntenic with the commonly amplified 8q12.1 and 8q22.1 regions in human breast cancer patients. G3: Genes|Genomes|Genetics (accepted).

  • Rau, A., Flister, M. J., Rui, H. and Livermore Auer, P. (2018) Exploring drivers of gene expression in The Cancer Genome Atlas. Bioinformatics, bty551, doi: https://doi.org/10.1101/227926. Preprint PDF Shiny app Code

  • Godichon-Baggioni, A., Maugis-Rabusseau, C. and Rau, A. (2018) Clustering transformed compositional data using K-means, with applications in gene expression and bicycle sharing system data. Journal of Applied Statistics, 46(1):47-65. Preprint PDF Code ProdInra

  • Rau, A. and Maugis-Rabusseau, C. (2018) Transformation and model choice for RNA-seq co-expression analysis. Briefings in Bioinformatics, bbw128, https://doi.org/10.1093/bib/bbw128. Preprint PDF Code ProdInra

  • Verrier, E., Genet, C., Laloë, D., Jaffrézic, F., Rau, A., Esquerre, D., Dechamp, N., Ciobataru, C., Hervet, C., Krieg, F., Quillet, E., Boudinot, P. (2018) Genetic and transcriptomic analyses provide new insights on the early antiviral response to VHSV in resistant and susceptible rainbow trout. BMC Genomics, 19:482. PDF ProdInra

  • Maroilley, T., Berri, M., Lemonnier, G., Esquerré, D., Chevaleyre, C., Mélo, S., Meurens, F., Coville, J.L., Leplat, J.J., Rau, A., Bed’hom, B., Vincent-Naulleau, S., Mercat, M.J., Billon, Y., Lepage, P., Rogel-Gaillard, C., and Estellé, J. (2018). Immunome differences between porcine ileal and jejunal Peyer’s patches revealed by global transcriptome sequencing of gut-associated lymphoid tissues. Scientific Reports, 8:9077. PDF ProdInra

  • Mondet, F., Rau, A., Klopp, C., Rohmer, M., Severac, D., Le Conte, Y., and Alaux, C. (2018). Transcriptome profiling of the honeybee parasite Varroa destructor provides new biological insights into the mite adult life cycle. BMC Genomics, 19:328. PDF ProdInra

  • He, B., Tjhung, K., Bennett, N., Chou, Y., Rau, A., Huang, J., and Derda, R. (2018). Compositional bias in naïve and chemically-modified phage-displayed libraries uncovered by paired-end deep sequencing. Scientific Reports, 8:1214. PDF ProdInra

2017

  • Monneret, G., Jaffrézic, F., Rau, A., Zerjal, T. and Nuel, G. (2017) Identification of marginal causal relationships in gene networks from observational and interventional expression data. PLoS One 12(3): e0171142. PDF Code ProdInra

  • Sauvage, C., Rau, A., Aichholz, C., Chadoeuf, J., Sarah, G., Ruiz, M., Santoni, S., Causse, M., David, J., Glémin, S. (2017) Domestication rewired gene expression and nucleotide diversity patterns in tomato. The Plant Journal 91(4):631-645. PDF ProdInra

2016

  • Rigaill, G., Balzergue, S., Brunaud, V., Blondet, E., Rau, A., Rogier, O., Caius, J., Maugis-Rabusseau, C., Soubigou-Taconnat, L., Aubourg, S., Lurin, C., Martin-Magniette, M.-L., and Delannoy, E. (2016) Synthetic datasets for the identification of key ingredients for RNA-seq differential analysis. Briefings in Bioinformatics, doi: https://doi.org/10.1093/bib/bbw092. PDF ProdInra

2015

  • Gallopin, M., Celeux, G., Jaffrézic, F., Rau, A. (2015) A model selection criterion for model-based clustering of annotated gene expression data. Statistical Applications in Genetics and Molecular Biology, 14(5): 413-428. PDF Code ProdInra

  • Monneret, G., Jaffrézic, F., Rau, A., Nuel, G. (2015) Estimation d’effets causaux dans les réseaux de régulation génique : vers la grande dimension. Revue d’intelligence artificielle, 29(2): 205-227. PDF Code ProdInra

  • Rau, A., Maugis-Rabusseau, C., Martin-Magniette, M.-L., Celeux, G. (2015) Co-expression analysis of high-throughput transcriptome sequencing data with Poisson mixture models. Bioinformatics, 31(9): 1420-1427. Preprint PDF Code ProdInra

2014

  • Rau, A., Marot, G. and Jaffrézic, F. (2014) Differential meta-analysis of RNA-seq data from multiple studies. BMC Bioinformatics, 15:91. PDF Code ProdInra

  • Endale Ahanda, M.-L., Zerjal, T., Dhorne-Pollet, S., Rau, A., Cooksey, A., and Giuffra, E. (2014) Impact of the genetic background on the composition of the chicken plasma miRNome in response to a stress. PLoS One, 9(12): e114598. PDF Code ProdInra

2013

  • Nuel, G., Rau, A., and Jaffrézic, F. (2013) Using pairwise ordering preferences to estimate causal effects in gene expression from a mixture of observational and intervention experiments. Quality Technology and Quantitative Management 11(1):23-37. PDF ProdInra

  • Rau, A., Jaffrézic, F., and Nuel, G. (2013) Joint estimation of causal effects from observational and intervention gene expression data. BMC Systems Biology 7:111. PDF Code ProdInra

  • Gallopin, M., Rau, A., and Jaffrézic, F. (2013). A hierarchical Poisson log-normal model for network inference from RNA sequencing data. PLoS One 8(10): e77503. PDF ProdInra

  • Rau, A., Gallopin, M., Celeux, G., and Jaffrézic, F. (2013). Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29(17): 2146-2152. PDF Code ProdInra

  • Dillies, M.-A.¹, Rau, A.¹, Aubert, J.¹, Hennequet-Antier, C.¹, Jeanmougin, M.¹, Servant, N.¹, Keime, C.¹, Marot, G., Castel, D., Estelle, J., Guernec, G., Jagla, B., Jouneau, L., Laloë, D., Le Gall, C., Schaëffer, B., Charif, D., Le Crom, S.¹, Guedj, M.¹, and Jaffrézic, F.¹ (2013). A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Briefings in Bioinformatics, doi:10.1093/bib/bbs046. PDF ProdInra
    ¹These authors contributed equally to this work.

  • Brenault, P., Lefevre, L., Rau, A., Laloë, D., Pisoni, G., Moroni, P., Bevilacquia, C. and Martin, P. (2013) Contribution of mammary epithelial cells to the immune response during early stages of a bacterial infection to Staphylococcus aureus. Veterinary Research, 45:16. PDF ProdInra

2012 and before

  • Rau, A., Jaffrézic, F., Foulley, J.-L., and Doerge, R. W. (2012). Reverse engineering gene regulatory networks using approximate Bayesian computation. Statistics and Computing, 22: 1257-1271. Preprint PDF ProdInra

  • Rau, A., Jaffrézic, F., Foulley, J.-L., and Doerge, R. W. (2010). An empirical Bayesian method for estimating biological networks from temporal microarray data. Statistical Applications in Genetics and Molecular Biology: Vol. 9: Iss. 1, Article 9. PDF Code ProdInra

  • Furth, A., Mandrekar, S., Tan, A., Rau, A., Felten, S., Ames, M., Adjei, A., Erlichman, C. and Reid, J. (2008). A limited sample model to predict area under the drug concentration curve for 17-(allylamino)-17-demethoxygeldanamycin and its active metabolite 17-(amino)-17-demethoxygeldanamycin. Cancer Chemotherapy Pharmacology, 61(1): 39-45.

Book chapters

  • Martin-Magniette, M.-L., Maugis-Rabusseau, C. and Rau, A. (2017) Clustering of co-expressed genes. In: Model Choice and Model Aggregation. Ed. F. Bertrand, J.-J. Droesbeke, G. Saporta, C. Thomas-Agnan. Publisher link HAL

Submitted and in preparation

  • Revilla, M., Rau, A., Crespo-Piazuelo, D., Ramayo-Caldas, Y., Estellé, J., INIA, Ballester, M., Folch, J. M. (2019) An integrative gene network analysis of the genetic determination of pig fatty acid composition based on adipose tissue RNA sequencing. Submitted.

  • Foissac, S., Djebali, S., Munyard, K., Villa-Vialaneix, N., Rau, A., Muret, K., Esquerre, D., Zytnicki, M., Derrien, T., Bardou, P., Blanc, F., Cabau, C., Crisci, E., Dhorne-Pollet, S., Drouet, F., Gonzales, I., Goubil, A., Lacroix-Lamande, S., Laurent, F., Marthey, S., Marti-Marimon, M., Momal-Leisenring, R., Mompart, F., Quere, P., Robelin, D., San Cristobal, M., Tosser-Klopp, G., Vincent-Naulleau, S., Fabre, S., Pinard-Van der Laan, M.-H., Klopp, C., Tixier-Boichard, M., Acloque, H., Lagarrigue, S., Giuffra, E. (2018) Livestock genome annotation: transcriptome and chromatin structure profiling in cattle, goat, chicken, and pig. bioRxiv, doi: https://doi.org/10.1101/316091. Submitted. Preprint ProdInra

  • Godichon-Baggioni, A., Maugis-Rabusseau, C. and Rau, A. (2018) Multi-view cluster aggregation and splitting, with an application to multi-omic breast cancer data. Submitted. Preprint Code

  • Tsaih, S.-W., Plasterer, C., Lemke, A., Ran, S., Rau, A., Auer, P., Rui, H. and Flister, M. J. (2018) Genetic mapping of pathophysiological modifiers in the breast tumor microenvironment. Submitted.

  • Jehl, F., Klopp, C., Brenet, M., Rau, A., Désert, C., Boutin, M., Leroux, S., Muret, K., Esquerré, D., Gourichon, D., Burlot, T., Pitel, F., Zerjal, T., Lagarrigue, S. (2018) Phenotype and multi-tissue transcriptome response to diet energy change in laying hens. In preparation.

Software

  • maskmeans: Multi-view aggregation/splitting K-means clustering algorithm.
  • Edge in TCGA: An R/Shiny interactive web application for the exploration of drivers of gene expression in The Cancer Genome Atlas.
  • coseq: Co-expression analysis of sequencing data.
  • ICAL: Model selection for model based clustering of annotated data.
  • metaRNASeq: Meta-analysis of RNA-seq data.
  • HTSDiff: Differential analysis for RNA-seq data.
  • HTSFilter: Filter for replicated high-throughput sequencing data.
  • HTSCluster: Clustering high-throughput sequencing data with Poisson mixture models.
  • ebdbNet: Empirical Bayes estimation for dynamic Bayesian networks.

Advising & Teaching

I am an adjunct instructor for the following graduate course at the Medical College of Wisconsin in Spring 2019:

  • MCW Physiological Genomics: Bioinformatics module

I was a teaching instructor for the following course at the University of Wisconsin-Milwaukee in Spring 2018:

  • UWM PH718: Data management and visualization

Alumni:

  • Dr. Gilles Monneret (2014-2018 Ph.D.): “Estimation of causal effects in gene networks from observational and intervention data” (co-supervision with Grégory Nuel and Florence Jaffrézic)
  • Raphaëlle Momal-Leisenring (2017 M2 internship): “Integrative statistical analysis of multi-omics data”
  • Frédéric Jehl (2017 M2 internship): “Impact of heat stress on liver and blood transcriptomes of laying hens” (co-supervision with Tatiana Zerjal)
  • Dr. Manuel Revilla Sanchez (2016 3-month Ph.D. Erasmus+ Learning Mobility): “An integrative gene network analysis of the genetic determination of pig fatty acid composition” (co-supervision with Jordi Estelle and Yuliaxis Ramayo Caldas)
  • Babacar Ciss (2016 M2 internship): “Constructing predictive models for ovine production data” (co-supervision with Eli Sellem, Allice)
  • Dr. Mélina Gallopin (2012-2015 Ph.D.): “Clustering and network inference for RNA-seq data” (co-supervision with Gilles Celeux and Florence Jaffrézic). Currently Assistant Professor (maître de conférences) at I2BC, Université Paris-Saclay.
  • Audrey Hulot (2015 M1 internship): “Incorporating a priori biological knowledge into gene network inference from observational and intervention gene expression data” (with Florence Jaffrézic)
  • Meriem Benabbas (2015 M1 internship): “Identifying differentially expressed genes from RNA-seq data using mixture models”
  • Rémi Bancal (2012 M2 internship): “Gene network estimation by adaptive knockout experiments” (co-supervision with Grégory Nuel and Florence Jaffrézic)
  • Mélina Gallopin (2012 M2 internship): “Gene network inference from RNA sequencing expression data” (co-supervision with Gilles Celeux and Florence Jaffrézic)

CV

Find my full CV in PDF here.

Posts

The start of a new year is always a nice time to look back and take stock of the past year, and look forward and set some goals for the coming year. I spent the entirety of 2018 as an AgreenSkills+ Visiting Scholar at UWM in Milwaukee, Wisconsin, which has been (and continues to be!) a very rich experience that has given me the chance to broaden my understanding of statistical genetics and genomics and expand my skill set.

This is a short post to provide details on how I created the visual CV that is included on my homepage. I got the idea from a tweet by the awesome Mara Averick about an R package called VisualResume by Nathaniel Phillips: “OMG, I love this! (I miss Breaking Bad so much) 📦 ‘VisualResume: An R package for creating a visual resume’ by @YaRrrBook https://t.co/ZNtbrU87Y4 #rstats pic.twitter.com/8KbI06QYKq” (Mara Averick, @dataandme, October 17, 2018). Once VisualResume is installed from GitHub (via devtools) and loaded, I just modified the Walter White example, resized the plot directly in the RStudio plot window, and exported to PNG.

tl;dr: Use I() to treat a numeric variable in a data.frame “as is” and avoid unintended conversion when mapping to transparency in a ggplot2 aesthetic. Today I ran into a ggplot2 plotting problem involving mapping the transparency aesthetic to a numeric variable – this drove me crazy until I figured it out. Here’s the basic set-up: I wanted to plot a scatterplot of two variables, but have the transparency of the points be controlled by a third (numeric) variable.
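
As a rough illustration of the tip above (a minimal sketch, not the code from the original post), the example below compares mapping a numeric column to alpha directly with wrapping it in I(). The data frame `df` and its column `weight` are hypothetical names invented for this example; `weight` is assumed to already hold the desired opacity values between 0 and 1.

```r
library(ggplot2)

# Hypothetical toy data: `weight` already holds the desired opacity in [0, 1].
set.seed(1)
df <- data.frame(x = rnorm(50), y = rnorm(50), weight = runif(50))

# Default behaviour: `weight` goes through the continuous alpha scale,
# which rescales the values to the scale's output range and adds a legend.
p_scaled <- ggplot(df, aes(x, y, alpha = weight)) + geom_point()

# Wrapping the variable in I() marks it "as is", so each point is drawn
# with the opacity stored in `weight`, without rescaling.
p_asis <- ggplot(df, aes(x, y, alpha = I(weight))) + geom_point()

# Inspect the alpha values that each plot would actually draw with.
alpha_scaled <- ggplot_build(p_scaled)$data[[1]]$alpha
alpha_asis <- as.numeric(ggplot_build(p_asis)$data[[1]]$alpha)
```

Comparing `alpha_asis` and `alpha_scaled` against `df$weight` shows which version keeps the original values. Adding `scale_alpha_identity()` to the first plot is another way to get the same effect.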

I recently decided that I wanted to move my professional homepage from a free page set up on WordPress to GitHub Pages using blogdown by Yihui Xie. There were basically two reasons for this: (1) Because I only sprang for the free WordPress site, there are gigantic, ugly ads that appear on every single page. I only recently realized this, as I was usually viewing my WordPress site while logged in, and apparently the ads only appear for other people.
