Projects

Multi-omic data integration

Integrate multi-omic cancer data representing gene expression, methylation, copy number alterations, somatic mutations, and microRNA expression

RNA-seq co-expression

Identify clusters of co-expressed genes from RNA-seq data

Network inference

Identify gene regulatory networks from transcriptomic data

Recent & Upcoming Talks

Co-expression analyses of RNA-seq data in practice with the R/Bioconductor package coseq
Jun 22, 2018 4:15 PM
Exploring drivers of gene expression in The Cancer Genome Atlas
Mar 28, 2018 8:30 AM

Publications

Book

  1. Albert, I., Ancelet, S., David, O., Denis, J.-B., Makowski, D., Parent, É., Rau, A., and Soubeyrand, S. (2015). Initiation à la statistique bayésienne : Bases théoriques et applications en alimentation, environnement, épidémiologie et génétique. Éditions Ellipses, collection références sciences. Code Publisher link

Dissertations

  1. Rau, A. (2017) Statistical methods and software for the analysis of transcriptomic data. HDR (Habilitation à diriger des recherches) thesis, Université d’Évry Val-d’Essonne. PDF Slides
    Note: HDR is a high-level (post-PhD) degree granted by French universities that provides an accreditation to supervise research.

  2. Rau, A. (2010) Reverse engineering gene networks using genomic time-course data. PhD thesis, Purdue University. PDF

Statistical methods

  1. Rau, A., Flister, M. J., Rui, H. and Livermore Auer, P. (2018) Exploring drivers of gene expression in The Cancer Genome Atlas. Bioinformatics, doi: https://doi.org/10.1101/227926 PDF Shiny app Code

  2. Godichon-Baggioni, A., Maugis-Rabusseau, C. and Rau, A. (2018) Clustering transformed compositional data using K-means, with applications in gene expression and bicycle sharing system data. Journal of Applied Statistics, 46(1):47-65. Preprint PDF Code Project

  3. Rau, A. and Maugis-Rabusseau, C. (2018) Transformation and model choice for RNA-seq co-expression analysis. Briefings in Bioinformatics, bbw128, https://doi.org/10.1093/bib/bbw128. Preprint PDF Code Project

  4. Monneret, G., Jaffrézic, F., Rau, A., Zerjal, T. and Nuel, G. (2017) Identification of marginal causal relationships in gene networks from observational and interventional expression data. PLoS One 12(3): e0171142. PDF Code Project

  5. Rigaill, G., Balzergue, S., Brunaud, V., Blondet, E., Rau, A., Rogier, O., Caius, J., Maugis-Rabusseau, C., Soubigou-Taconnat, L., Aubourg, S., Lurin, C., Martin-Magniette, M.-L., and Delannoy, E. (2016) Synthetic datasets for the identification of key ingredients for RNA-seq differential analysis. Briefings in Bioinformatics, doi: https://doi.org/10.1093/bib/bbw092. PDF

  6. Gallopin, M., Celeux, G., Jaffrézic, F., Rau, A. (2015) A model selection criterion for model-based clustering of annotated gene expression data. Statistical Applications in Genetics and Molecular Biology, 14(5): 413-428. PDF Code

  7. Monneret, G., Jaffrézic, F., Rau, A., Nuel, G. (2015) Estimation d’effets causaux dans les réseaux de régulation génique : vers la grande dimension. Revue d’intelligence artificielle, 29(2): 205-227. PDF Code

  8. Rau, A., Maugis-Rabusseau, C., Martin-Magniette, M.-L., Celeux, G. (2015) Co-expression analysis of high-throughput transcriptome sequencing data with Poisson mixture models. Bioinformatics, 31(9): 1420-1427. Preprint PDF Code Project

  9. Rau, A., Marot, G. and Jaffrézic, F. (2014) Differential meta-analysis of RNA-seq data from multiple studies. BMC Bioinformatics, 15:91. PDF Code

  10. Nuel, G., Rau, A., and Jaffrézic, F. (2013) Using pairwise ordering preferences to estimate causal effects in gene expression from a mixture of observational and intervention experiments. Quality Technology and Quantitative Management 11(1):23-37. PDF

  11. Rau, A., Jaffrézic, F., and Nuel, G. (2013) Joint estimation of causal effects from observational and intervention gene expression data. BMC Systems Biology 7:111. PDF Code

  12. Gallopin, M., Rau, A., and Jaffrézic, F. (2013). A hierarchical Poisson log-normal model for network inference from RNA sequencing data. PLoS One 8(10): e77503. PDF

  13. Rau, A., Gallopin, M., Celeux, G., and Jaffrézic, F. (2013). Data-based filtering for replicated high-throughput transcriptome sequencing experiments. Bioinformatics 29(17): 2146-2152. PDF Code

  14. Dillies, M.-A.¹, Rau, A.¹, Aubert, J.¹, Hennequet-Antier, C.¹, Jeanmougin, M.¹, Servant, N.¹, Keime, C.¹, Marot, G., Castel, D., Estelle, J., Guernec, G., Jagla, B., Jouneau, L., Laloë, D., Le Gall, C., Schaëffer, B., Charif, D., Le Crom, S.¹, Guedj, M.¹, and Jaffrézic, F.¹ (2012). A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Briefings in Bioinformatics, doi:10.1093/bib/bbs046. PDF
    ¹These authors contributed equally to this work.

  15. Rau, A., Jaffrézic, F., Foulley, J.-L., and Doerge, R. W. (2012). Reverse engineering gene regulatory networks using approximate Bayesian computation. Statistics and Computing, 22: 1257-1271. Preprint PDF

  16. Rau, A., Jaffrézic, F., Foulley, J.-L., and Doerge, R. W. (2010). An empirical Bayesian method for estimating biological networks from temporal microarray data. Statistical Applications in Genetics and Molecular Biology: Vol. 9: Iss. 1, Article 9. PDF Code

Statistical applications

  1. Verrier, E., Genet, C., Laloë, D., Jaffrézic, F., Rau, A., Esquerre, D., Dechamp, N., Ciobataru, C., Hervet, C., Krieg, F., Quillet, E., Boudinot, P. (2018) Genetic and transcriptomic analyses provide new insights on the early antiviral response to VHSV in resistant and susceptible rainbow trout. BMC Genomics, 19:482. PDF

  2. Maroilley, T., Berri, M., Lemonnier, G., Esquerré, D., Chevaleyre, C., Mélo, S., Meurens, F., Coville, J.L., Leplat, J.J, Rau, A., Bed’hom, B., Vincent-Naulleau, S., Mercat, M.J., Billon, Y., Lepage, P., Rogel-Gaillard, C., and Estellé, J. (2018). Immunome differences between porcine ileal and jejunal Peyer’s patches revealed by global transcriptome sequencing of gut-associated lymphoid tissues. Scientific Reports, 8:9077. PDF

  3. Mondet, F., Rau, A., Klopp, C., Rohmer, M., Severac, D., Le Conte, Y., and Alaux, C. (2018). Transcriptome profiling of the honeybee parasite Varroa destructor provides new biological insights into the mite adult life cycle. BMC Genomics, 19:328. PDF

  4. He, B., Tjhung, K., Bennett, N., Chou, Y., Rau, A., Huang, J., and Derda, R. (2018). Compositional bias in naïve and chemically-modified phage-displayed libraries uncovered by paired-end deep sequencing. Scientific Reports, 8:1214. PDF

  5. Sauvage, C., Rau, A., Aichholz, C., Chadoeuf, J., Sarah, G., Ruiz, M., Santoni, S., Causse, M., David, J., Glémin, S. (2017) Domestication rewired gene expression and nucleotide diversity patterns in tomato. The Plant Journal 91(4):631-645. PDF

  6. Endale Ahanda, M.-L., Zerjal, T., Dhorne-Pollet, S., Rau, A., Cooksey, A., and Giuffra, E. (2014) Impact of the genetic background on the composition of the chicken plasma miRNome in response to a stress. PLoS One, 9(12): e114598. PDF

  7. Brenault, P., Lefevre, L., Rau, A., Laloë, D., Pisoni, G., Moroni, P., Bevilacqua, C. and Martin, P. (2013) Contribution of mammary epithelial cells to the immune response during early stages of a bacterial infection to Staphylococcus aureus. Veterinary Research, 45:16. [link] PDF

  8. Furth, A., Mandrekar, S., Tan, A., Rau, A., Felten, S., Ames, M., Adjei, A., Erlichman, C. and Reid, J. (2008). A limited sample model to predict area under the drug concentration curve for 17-(allylamino)-17-demethoxygeldanamycin and its active metabolite 17-(amino)-17-demethoxygeldanamycin. Cancer Chemotherapy and Pharmacology, 61(1): 39-45.

Book chapters

  1. Martin-Magniette, M.-L., Maugis-Rabusseau, C. and Rau, A. (2017) Clustering of co-expressed genes. In: Model Choice and Model Aggregation. Ed. F. Bertrand, J.-J. Droesbeke, G. Saporta, C. Thomas-Agnan. Publisher link

Submitted and pre-prints

  1. Foissac, S., Djebali, S., Munyard, K., Villa-Vialaneix, N., Rau, A., Muret, K., Esquerre, D., Zytnicki, M., Derrien, T., Bardou, P., Blanc, F., Cabau, C., Crisci, E., Dhorne-Pollet, S., Drouet, F., Gonzales, I., Goubil, A., Lacroix-Lamande, S., Laurent, F., Marthey, S., Marti-Marimon, M., Momal-Leisenring, R., Mompart, F., Quere, P., Robelin, D., San Cristobal, M., Tosser-Klopp, G., Vincent-Naulleau, S., Fabre, S., Pinard-Van der Laan, M.-H., Klopp, C., Tixier-Boichard, M., Acloque, H., Lagarrigue, S., Giuffra, E. Livestock genome annotation: transcriptome and chromatin structure profiling in cattle, goat, chicken, and pig. bioRxiv, doi: https://doi.org/10.1101/316091. Preprint

  2. Jehl, F., Klopp, C., Brenet, M., Rau, A., Désert, C., Boutin, M., Leroux, S., Muret, K., Esquerré, D., Gourichon, D., Burlot, T., Pitel, F., Zerjal, T., Lagarrigue, S. Phenotype and multi-tissue transcriptome response to diet energy change in laying hens. In preparation.

  3. Godichon-Baggioni, A., Maugis-Rabusseau, C. and Rau, A. (2018) Multi-view cluster aggregation and splitting, with an application to multi-omic breast cancer data. In preparation. Code

Software

  • maskmeans: Multi-view aggregation/splitting K-means clustering algorithm.
  • Edge in TCGA: An R/Shiny interactive web application for the exploration of drivers of gene expression in The Cancer Genome Atlas.
  • coseq: Co-expression analysis of sequencing data.
  • ICAL: Model selection for model-based clustering of annotated data.
  • metaRNASeq: Meta-analysis of RNA-seq data.
  • HTSDiff: Differential analysis for RNA-seq data.
  • HTSFilter: Filter for replicated high-throughput sequencing data.
  • HTSCluster: Clustering high-throughput sequencing data with Poisson mixture models.
  • ebdbNet: Empirical Bayes estimation for dynamic Bayesian networks.

Advising & Teaching

I was a teaching instructor for the following course at the University of Wisconsin-Milwaukee in Spring 2018:

  • UWM PH718: Data management and visualization

Alumni:

  • Dr. Gilles Monneret (2014-2018 Ph.D.): “Estimation of causal effects in gene networks from observational and intervention data” (co-supervision with Grégory Nuel and Florence Jaffrézic)
  • Raphaëlle Momal-Leisenring (2017 M2 internship): “Integrative statistical analysis of multi-omics data”
  • Frédéric Jehl (2017 M2 internship): “Impact of heat stress on liver and blood transcriptomes of laying hens” (co-supervision with Tatiana Zerjal)
  • Dr. Manuel Revilla Sanchez (2016 3-month Ph.D. Erasmus+ Learning Mobility): “An integrative gene network analysis of the genetic determination of pig fatty acid composition” (co-supervision with Jordi Estelle and Yuliaxis Ramayo Caldas)
  • Babacar Ciss (2016 M2 internship): “Constructing predictive models for ovine production data” (co-supervision with Eli Sellem, Allice)
  • Dr. Mélina Gallopin (2012-2015 Ph.D.): “Clustering and network inference for RNA-seq data” (co-supervision with Gilles Celeux and Florence Jaffrézic). Currently Assistant Professor (maître de conférences) at I2BC, Université Paris-Saclay.
  • Audrey Hulot (2015 M1 internship): “Incorporating a priori biological knowledge into gene network inference from observational and intervention gene expression data” (with Florence Jaffrézic)
  • Meriem Benabbas (2015 M1 internship): “Identifying differentially expressed genes from RNA-seq data using mixture models”
  • Rémi Bancal (2012 M2 internship): “Gene network estimation by adaptive knockout experiments” (co-supervision with Grégory Nuel and Florence Jaffrézic)
  • Mélina Gallopin (2012 M2 internship): “Gene network inference from RNA sequencing expression data” (co-supervision with Gilles Celeux and Florence Jaffrézic)

CV

Find my full CV in PDF here.

Posts

This is a short post to provide details on how I created the visual CV that is included on my homepage. I got the idea from a tweet by the awesome Mara Averick about an R package called VisualResume by Nathaniel Phillips:

    OMG, I love this! (I miss Breaking Bad so much) 📦 “VisualResume: An R package for creating a visual resume” by @YaRrrBook https://t.co/ZNtbrU87Y4 #rstats pic.twitter.com/8KbI06QYKq
    — Mara Averick (@dataandme), October 17, 2018

Once VisualResume is installed from GitHub (via devtools) and loaded, I just modified the Walter White example, resized the plot directly in the RStudio plot window, and exported to PNG.

CONTINUE READING

tl;dr: Use I() to treat a numeric variable in a data.frame “as is” and avoid unintended conversion when mapping to transparency in a ggplot2 aesthetic. Today I ran into a ggplot2 plotting problem involving mapping the transparency aesthetic to a numeric variable – this drove me crazy until I figured it out. Here’s the basic set-up: I wanted to plot a scatterplot of two variables, but have the transparency of the points be controlled by a third (numeric) variable.

CONTINUE READING

I recently decided to move my professional homepage from a free page set up on WordPress to GitHub Pages using blogdown by Yihui Xie. There were basically two reasons for this: (1) Because I only sprang for the free WordPress site, there are gigantic, ugly ads that appear on every single page. I only recently realized this, as I usually viewed my WordPress site while logged in, and apparently the ads only appear for other people.

CONTINUE READING