UNIVERSIDAD NACIONAL AUTÓNOMA DE MÉXICO
DOCTORADO EN CIENCIAS BIOMÉDICAS
INSTITUTO DE ECOLOGÍA
LA DECISIÓN DEL DESTINO CELULAR COMO UNA PROPIEDAD EMERGENTE EN UN
PAISAJE EPIGENÉTICO: MODELOS  DINÁMICOS DE CIRCUITOS Y MÓDULOS GENÉTICOS
TESIS
QUE PARA OPTAR POR EL GRADO DE:
DOCTOR EN CIENCIAS
PRESENTA:
JOSÉ DÁVILA VELDERRAIN
TUTOR PRINCIPAL:
DRA. MARÍA ELENA ALVAREZ-BUYLLA ROCES
INSTITUTO DE ECOLOGÍA, UNAM
MIEMBROS DEL COMITÉ TUTOR:
DR. CARLOS VILLARREAL LUJÁN
INSTITUTO DE FÍSICA, UNAM
DR. HERNÁN LARRALDE RIDAURA
INSTITUTO DE CIENCIAS FÍSICAS, UNAM
MEXICO DF, AGOSTO 2015 
 
UNAM – Dirección General de Bibliotecas 
Tesis Digitales 
Restricciones de uso 
  
DERECHOS RESERVADOS © 
PROHIBIDA SU REPRODUCCIÓN TOTAL O PARCIAL 
  
Todo el material contenido en esta tesis esta protegido por la Ley Federal 
del Derecho de Autor (LFDA) de los Estados Unidos Mexicanos (México). 
El uso de imágenes, fragmentos de videos, y demás material que sea 
objeto de protección de los derechos de autor, será exclusivamente para 
fines educativos e informativos y deberá citar la fuente donde la obtuvo 
mencionando el autor o autores. Cualquier uso distinto como el lucro, 
reproducción, edición o modificación, será perseguido y sancionado por el 
respectivo titular de los Derechos de Autor. 
 
  
 
...
Agradecimientos
Primeramente me gustaŕıa agradecer a mi asesora Elena por siempre tener sus puertas abiertas
para preguntas o discusiones; en especial por su apoyo, confianza, y la coordinacion de todos
los proyectos de investigacion – y por siempre respaldar mis inquietudes cient́ıficas. También
me gustaŕıa agradecer a Jorge Armando Verdusco Mart́ınez por ser un gran maestro, colega y
amigo durante y desde la licenciatura. A Juan Carlos Mart́ınez Garćıa por compartir horas de
discusiones interesantes y por su colaboración en mi trabajo.
También me gustaŕıa expresar aqúı mi gratitud a aquellos autores cuyos trabajos han moti-
vado en gran medida mi gusto por la ciencia en general, y por el enfoque de sistemas complejos
a la bioloǵıa en particular. Principalmente a: Sui Huang, Stuart Kauffman y Kunihiko Kaneko
– a quienes no tengo el gusto de conocer personalmente.
Por último gracias a:
Mis co-asesores Carlos Villarreal y Hernán Larralde.
Mi co-asesor Stephan Ossowski, quien me apoyó durante una estancia en EMBL–CRG unit.
Mis colegas del Laboratorio de Genética y Evolución de Plantas.
Mi colega Shalu Jhanwar de EMBL–CRG unit.
Mis coautores
Mi familia
Agradezco el apoyo financiero del CONACYT.
iv
Contents
1 Introducción General 1
1.1 ¿Por qué estudiar la decisión del destino celular? . . . . . . . . . . . . . . . . . . 1
1.2 Definición del Problema de Estudio . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Esquema General de la Tesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.4 Información de Art́ıculos . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Introdución al Marco Teórico-Conceptual 8
2.1 Art́ıculo I: Linear causation schemes in post-genomic biology: the subliminal
and convenient one-to-one genotype-phenotype mapping assumption
(published in INTERdisciplina, 3(5)) . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Art́ıculo II: Bridging the Genotype and the Phenotype: Towards An Epigenetic
Landscape Approach to Evolutionary Systems Biology
(published in Frontiers in Ecology, Evolution and Complexity. CopIt ArXives,
2014) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3 Metodoloǵıa 39
3.1 Art́ıculo III: Gene Regulatory Network Models for Floral Organ Determination
(published in Flower Development (pp. 441-469). Springer) . . . . . . . . . . . . 40
3.2 Art́ıculo IV: Descriptive vs. Mechanistic Network Models in Plant Development
in the Post-Genomic Era
(published in Plant Functional Genomics: Methods and Protocols, (pp. 455-479),
Springer) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
v
3.3 Art́ıculo V: Modeling the epigenetic attractors landscape: toward a post-
genomic mechanistic understanding of development
(published in Frontiers in Genetics - Systems Biology, 6, 160) . . . . . . . . . . . 96
4 Resultados 111
4.1 Art́ıculo VI: Reshaping the epigenetic landscape during early flower develop-
ment: induction of attractor transitions by relative differences in gene decay rates
(in press in BMC Systems Biology) . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.2 Art́ıculo VII: Dynamic network and epigenetic landscape model of a regulatory
core underlying spontaneous immortalization and epithelial carcinogenesis
(submitted to Journal of the Royal Society Interface) . . . . . . . . . . . . . . . . 127
4.3 Art́ıculo VIII: Methods for Characterizing the Epigenetic Attractors Land-
scape Associated with Boolean Gene Regulatory Networks
(in preparation for Frontiers in Genetics - Bioinformatics and Computational
Biology) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
4.4 Art́ıculo XI: Molecular evolution constraints in the floral organ specification
gene regulatory network module across 18 angiosperm genomes
(published in Molecular biology and evolution, 31(3), 560-573) . . . . . . . . . . 172
4.5 Art́ıculo X: XAANTAL2 (AGL14) Is an Important Component of the Complex
Gene Regulatory Network that Underlies Arabidopsis Shoot Apical Meristem
Transitions
(published in Molecular Plant, 8(5), 796-813) . . . . . . . . . . . . . . . . . . . . 187
5 Conclusiones 206
vi
La Decisión Del Destino Celular Como Una Propiedad Emergente En Un
Paisaje Epigenético:
Modelos Dinámicos De Circuitos Y Módulos Genéticos
por
José Dávila Velderrain
Resumen
De igual manera que los humanos tomamos decisiones, las células que constituyen al humano
también toman decisiones – las cuales son requeridas para producir al humano en primer lugar.
Durante el desarrollo de organismos multicelulares las células deciden acerca de sus destinos
mientras proliferan. A diferencia de los humanos, sin embargo, las células no tienen conciencia.
¿Cómo podemos entonces entender las decisiones sobre el destino celular durante el desarrollo
como una consecuencia natural del funcionamiento celular interno? Esta tésis tiene el propósito
de presentar ideas clarificadoras acerca de esta pregunta general. Particularmente, sobre como
se puede explotar la claridad conceptual de un modelo metafórico de hace más de medio siglo,
el Paisaje Epigenético de Waddington, con el objetivo de formular modelos mecanicistas sobre
la decisión del destino celular basados en el papel organizacional de redes regulatorias genéticas
subyacentes. Decisiones del destino celular resultan como una consecuencia de estos sistemas
moleculares regulatorios, como tal, se espera que su papel organizacional sea un evento persis-
tente durante la evolución. Un segundo propósito en esta tesis es estudiar la historia evolutiva
y la relevancia de la conservación de una red regulatoria genética bien caracterizada y validada:
la red de regulación genética del establecimiento del destino celular en la flor de Arabidopsis
thaliana.
Métodos Se hace uso extensivo de modelos de redes regulatorias genéticas y análisis matemáticos
de su dinámica. Modelos convencionales de redes regulatorias genéticas son extendidos para
proponer un grupo de modelos dinámicos definido aqúı como modelos del Paisaje Epigenético
de Atractores. Para los análisis evolutivos se utilizan métodos estad́ısticos que permiten inferir
el papel de distintos tipos de selección natural en linajes y sitios espećıficos para genes de un
módulo regulatorio conservado en las plantas angiospermas.
Resultados y Conclusiones De manera global, reportamos dos resultados principales: (1)
integramos propuestas de modelado necesarias para sustanciar la propuesta de que el grupo de
modelos definido aqúı como modelos del Paisaje Epigenético de Atractores constituyen la ex-
tensión más natural para el protocolo ya establecido de modelado de redes regulatorias genéticas,
y una adición valiosa para las herramientas de la bioloǵıa de sistemas. (2) Presentamos eviden-
cia que indica que la relevancia funcional de redes regulatorias que especifican destinos celulares
en la dinámica del desarrollo restringe su capacidad de sufrir un alto grado de variación durante
la evolución. En otras palabras, los módulos regulatorios del desarrollo parecen ser procesos
vii
clave que se encuentran conservados (no estń cambiando) en sistemas biológicos que presentan
un procesos de desarrollo.
viii
La Decisión Del Destino Celular Como Una Propiedad Emergente En Un
Paisaje Epigenético:
Modelos Dinámicos De Circuitos Y Módulos Genéticos
by
José Dávila Velderrain
Abstract
Much as humans make decisions during their lives, the cells that constitute the human also make
decisions – which are required to produce the human in the first place. During the development
of a multicellular organism cells decide about their fate while proliferating. Unlike humans,
however, cells do not have consciousness. How are we to understand cell-fate choices during
development as a natural consequence of their inner-workings? The present thesis is meant to
provide insights into this general question. Particularly, into how we can exploit the conceptual
clarity of a half-century old metaphoric model, Waddington’s Epigenetic Landscape, in order to
derive mechanistic, post-genomic models of cell-fate decision based on the orchestrating role of
underlying gene regulatory networks. Cell-fate decisions result as a natural consequence of such
molecular regulatory systems, as such, their orchestrating role is expected to be a persistent
event during evolution. A second major concern in this thesis is the evolutionary history and
relevance of gene regulatory networks persistence.
Methods Models of gene regulatory networks and conventional mathematical analyzes of
their dynamics are extensively used through the thesis. Conventional models of gene regu-
latory networks are extended in order to propose a group of dynamical models defined here
as Epigenetic Attractors Landscape models. Conventional molecular evolutionary analysis are
used.
Results and conclusions Overall, we report two main results: (1) we present the necessary
background and modeling proposals to substantiate the claim that the group of models defined
here as Epigenetic Attractors Landscape models are the most natural extension to the already
established protocol of gene regulatory network modeling, and a valuable addition to the sys-
tems biology toolkit. (2) We present evidence indicating that the functional relevance of gene
regulatory networks specifying cell-fates in developmental dynamics precludes them for having
a high degree of variation during evolution. In other words, developmental regulatory modules
seem to be key conserved, unchanging processes in biological systems undergoing development.
Keywords Gene Regulatory Networks, Epigenetic Landscape, Systems Dynamics, Epigenetic
Attractors Landscape, Evolutionary Systems Biology.
ix
Chapter 1
Introducción General
The generality of the paradox
... that the more facts we learn the less we understand the process we study ...
suggested some common fundamental flaw of how biologists approach problems.
— Yuri Lazebnik, Can a biologist fix a radio? (2002)
1.1 ¿Por qué estudiar la decisión del destino celular?
Este proyecto es el resultado de tres inquietudes principales que se fueron desarrollando durante
mis estudios – las cuales se fueron concretizando en gran medida gracias a los antecedentes
producidos y las discusiones llevadas acabo en el Laboratorio de Genética y Evolución de
plantas del Instituto de Ecoloǵıa. Estas inquietudes, aunque un tanto dispersas a primera
vista, se relacionan dada su intersección con el problema general sobre el entendimiento del
origen y la regulación del proceso de desarrollo de un organismo multicelular; particularmente,
el proceso de diferenciación celular – i.e., la decisión del destino celular.
La primera inquietud se puede expresar de la siguiente manera: dado que cada célula
de un organismo multicelular contiene el mismo conjunto de genes (y el mismo genoma), y
considerando que todas sus células se originan de una sola, ¿cómo es que durante el desarrollo las
células adquieren diferentes fenotipos celulares de manera robusta y reproducible? Y, por otro
lado, ¿como es que el desarrollo de enfermedades degenerativas a edades avanzadas presenta
manifestaciones fenot́ıpicas anormales, pero estas, de forma similar, se manifiestan también
de manera robusta y reproducible? Estas observaciones sugieren que existe un mecanismo
subyacente que de alguna manera regula este comportamiento y que no depende directamente
1
de la presencia o ausencia de genes individuales.
La segunda inquietud es de naturaleza metodológica y distingue dos aspectos, uno concep-
tual y uno práctico. Por un lado, ¿existe algún marco teórico-conceptual que permita discutir de
manera concreta los problemas expresados arriba? Por otro lado, ¿contamos con herramientas
teóricas suficientes para lograr un entendimiento de las observaciones mas allá de la descripción?
¿Cómo podemos formalizar las preguntas en modelos con fines predictivos? ¿Es necesario pro-
poner nuevas herramientas?
Por último, la tercera inquietud surge por deducción lógica a partir del supuesto enunci-
ado en la primera inquietud: si existe un mecanismo subyacente que regula la diferenciación
celular durante el desarrollo, este mecanismo debió haber surgido en etapas tempranas de la
muticelularidad; por lo tanto, es razonable pensar que este mecanismo se encuentra conservado
en organismos que manifiestan un proceso de desarrollo similar. ¿Existe evidencia de esto?
En particular, para explorar esta ultima pregunta en la presente tesis se estudia un sistema
biológico espećıfico: el desarrollo temprano de la flor (ver abajo).
1.2 Definición del Problema de Estudio
En este proyecto las inquietudes generales expresadas en la sección anterior se definen de manera
concreta y operacional de la siguiente manera.
El problema de la decisión del destino celular A nivel celular, tanto el desarrollo normal
como el desarrollo de enfermedades degenerativas involucra múltiples eventos de diferenciación
celular. Espećıficamente, una célula con la capacidad de adquirir más de un fenotipo discreto
al diferenciarse, en cada evento de diferenciación adquiere solo un fenotipo. Para el propósito
de este proyecto, el problema de la decisión del destino celular consiste en entender: (1) como
se establecen estos fenotipos potenciales y (2) la forma en que la célula en cuestión adquiere un
fenotipo y no otro. Se asume que este comportamiento resulta de manera natural a partir del
funcionamiento interno de la célula que se pretende estudiar por medio del modelado matemático
y computacional.
2
La dinámica de redes regulatorias como un mecanismo subyacente Este proyecto
toma como principal hipótesis de trabajo lo siguiente: la acción concertada de los genes y sus
interacciones representadas en redes regulatorias genéticas restringe el comportamiento permis-
ible de las células, y como resultado de estas restricciones los destinos celulares potenciales son
especificados. En relación a esta hipótesis, se considera que la perspectiva del comportamiento
celular basada en la teoŕıa de sistemas dinámicos brinda un marco teórico-conceptual formal y
concreto que permite estudiar el problema de la toma de decisión celular de manera natural.
Modelos del Paisaje Epigenético asociado a redes regulatorias genéticas A pesar
de contar con diversos modelos establecidos para el modelado de redes regulatorias genéticas,
estos cuentan con limitaciones cuando se intenta abordar de manera natural las preguntas más
relevantes para el problema de la toma de decisión celular. En linea con antecedentes recientes
en el modelado de la diferenciación celular, en este proyecto se propone el uso de extensiones
de modelos dinámicos de redes regulatorias genéticas con la intensión de modelar un paisaje
epigenético subyacente. Los modelos resultantes permiten abordar el problema de manera
natural.
Conservación evolutiva de una red regulatoria genética Motivados por la tercer inqui-
etud sobre la existencia y conservación de un mecanismo subyacente a un proceso de desarrollo
robusto, en este proyecto se plantea la hipótesis de que la red regulatoria genética caracteri-
zada como orquestador de un proceso de diferenciación celular se encuentra conservada entre
especies que manifiestan el mismo proceso. En particular, se prueba esta hipótesis utilizando
como sistema de estudio el desarrollo temprano de la flor. Dado que el patrón floral en términos
de tipos de órganos de la flor y organización espacio-temporal de los mismos están conservados
en prácticamente todas las plantas angiospermas, originalmente se infirió la existencia de un
mecanismo subyancente robusto. Se probó la existencia de tal mecanismo mediante la propuesta
de una red regulatoria genética descrita originalmente en Arabidopsis thaliana. Considerando
este antecedente, en el presente trabajo exploramos (1) si los componentes de la red regulatoria
genética están conservados a nivel de secuencia en las plantas angiospermas para las cuales
se ha secuenciado el genoma y (2) si existe evidencia molecular sugerente de la ocurrencia de
restricciones funcionales a la evolución molecular de la red regulatoria.
3
1.3 Esquema General de la Tesis
La tesis está estructurada de la siguiente manera. Los Art́ıculos I y II forman una parte concep-
tual en donde se introduce la perspectiva de la dinámica de una red regulatoria genética como
un mecanismo cuyas restricciones regulatorias especifican los fenotipos celulares observables y
por lo tanto los destinos celulares. Se discute esta perspectiva en el contexto del abordaje con-
vencional de la bioloǵıa molecular y la bioloǵıa evolutiva. Los detalles técnicos sobre los métodos
necesarios para el estudio y análisis de modelos dinámicos de redes regulatorias genéticas se
presentan en los Art́ıculos III y IV. El Art́ıculo V, por otro lado, argumenta que el problema de
la toma de decisión celular requiere de metodoloǵıas que van mas allá de los modelos conven-
cionales de redes regulatorias genéticas. En particular, se propone la integración de modelos
propuestos recientemente bajo el marco teórico de la formalización del Paisaje Epigenético de
Atractores a partir de los modelos dinámicos de redes regulatorias genéticas como un enfoque
natural para estudiar el problema. Los Art́ıculos III, IV y V entonces exponen el componente
metodológico principal que es aplicado en los siguientes art́ıculos de la tesis. Cabe destacar
que estos tres art́ıculos ofrecen antecedentes importantes sobre el modelaje en general y sobre
modelos dinámicos en particular, ofreciendo una introducción técnica al modelado en bioloǵıa.
Adicionalmente, los Art́ıculos III, IV y V – basados en el lenguaje del estudio de sistemas com-
plejos – introducen el marco teórico-conceptual necesario para abordar de manera concreta el
problema de la toma de decisiones celulares.
En el resto de la tesis se presenta la aplicación a sistemas biológicos particulares del marco
conceptual y los métodos desarrollados en los art́ıculos anteriores. En el Articulo VI se toma
uno de los modelos de redes de regulación genética mas estudiados – i.e., la red de regulación
genética del establecimiento de los destinos celulares durante el desarrollo temprano de la flor de
Arabidopsis – y se presenta un marco metodológico integrativo para estudiar el papel funcional
de genes individuales en el contexto de la toma de decisiones celulares mediante modificaciones
estructurales al Paisaje Epigenético subyacente. En el Articulo VII se extiende el abordaje al
estudio de otro orgasnismo multicelular, el humano, y en particular a los procesos celulares
durante el desarrollo de una manifestación patológica: la carcinogenesis. En este art́ıculo se
propone un nuevo modelo de red regulatoria genética como mecanismo genérico subyacente en el
establecimiento de los fenotipos celulares observados durante la transformación tumorigénica de
4
lineas celulares epiteliales, y se analiza su Paisaje Epigenético asociado. En el Articulo VIII se
presenta una implementación novedosa de los métodos para el modelaje del Paisaje Epigenético
asociado a redes regulatorias genéticas. En el Articulo IX, se presenta un enfoque emṕırico para
abordar la pregunta sobre la conservación evolutiva de un mecanismo subyacente en la deter-
minación del destino celular. Espećıficamente, se toma nuevamente como sistema de estudio
el proceso de desarrollo temprano de la flor, y mediante el uso de las secuencias genómicas
disponibles de plantas con flor, se prueba la conservación de la red de regulación genética tanto
en composición de genes como en propiedades de secuencia. Por último, en el Articulo IX se
presenta el producto de la colaboración con la división experimental del laboratorio: se propone
un nuevo modelo de red regulatoria genética y su asociado paisaje epigenético de atractores
a partir de datos experimentales originales obtenidos en el laboratorio. Se muestra como tal
interacción teórico-experimental permite generar una explicación mecanicista a los eventos de
transición observados en el meristemo de flor.
A lo largo de la tesis nos referimos a todas las publicaciones de manera genérica como
Artćulos, sin distinguir entre su naturaleza espećıfica.
En resumen, los objetivos concretos del proyecto general de tesis fueron:
• Contrastar la perspectiva de un modelo conceptual de mapeo de genotipo a fenotipo uno
a uno con la perspectiva de un modelo de mapeo en términos del rol auto-organizacional
de redes regulatorias genéticas (Articulo I).
• Proponer el modelo del Paisaje Epigenético asociado a una GRN como un marco teórico
para el estudio del efecto que tiene la generación de variación fenot́ıpica durante el desar-
rollo en la evolución (Articulo II).
• Revisar y explicar los aspectos prácticos y metodoloǵıas involucradas en el planteamiento,
formalización y análisis de redes regulatorias genéticas (Articulo III).
• Describir y comparar los enfoques mecanicista y descriptivo (inferencial) en el modelado
de redes regulatorias genéticas, con énfasis en terminoloǵıa y aspectos prácticos asociados
(Articulo IV).
5
• Introducir el término de Paisaje Epigenético de Atractores como la formalización del
modelo conceptual del Paisaje Epigenético de Waddington en el contexto de las redes
regulatorias genéticas y la teoŕıa de sistemas dinámicos. Revisar y discutir las estrategias
de modelado del Paisaje Epigenético de Atractores (Articulo V).
• Proponer un marco metodológico para extender modelos de redes regulatorias genéticas
con la intensión de investigar el impacto de perturbaciones a genes espećıficos en la toma de
decisión celular como resultado de la re–estructuración del Paisaje Epigenético subyacente
(Articulo VI).
• Integrar datos experimentales para proponer un modelo de red de regulación genética
para el proceso de transformación tumorigénica in vitro por inmortalizacion espontanea.
Mediante el análisis dinámico de la red y su Paisaje Epigenético subyacente, probar si los
componentes moleculares y sus interacciones son necesarios y suficientes para recuperar
los destinos celulares y transiciones observadas in-vitro e in-vivo(Articulo VII).
• Proponer una implementación novedosa de los métodos de modelaje del Paisaje Epi-
genético de Attractores asociado a redes regulatorias genéticas y hacerla disponible a la
comunidad cient́ıfica (Articulo VIII).
• Probar si los componentes de la red de regulación genética del establecimiento de los
destinos celulares durante el desarrollo temprano de la flor de Arabidopsis se encuentran
conservados a nivel molecular a lo largo de las plantas con flor. Probar si existe eviden-
cia de que el módulo regulatorio ha sido sometido a restricciones funcionales durante la
evolución (Articulo IX).
1.4 Información de Art́ıculos
Articulo I: ensayo publicado en la revista INTERdisciplina , UNAM [Dávila-Velderrain y
Álvarez-Buylla Roces].
Articulo II: caṕıtulo publicado en el libro Frontiers in Ecology, Evolution and Complexity,
CopIt ArXives [Davila-Velderrain et al., 2014a].
6
Articulo III: caṕıtulo publicado en el libro Flower Development, Springer [Azpeitia et al.,
2014].
Articulo IV: caṕıtulo publicado en el libro Plant Functional Genomics: Methods and Proto-
cols, Springer [Davila-Velderrain et al., 2015a].
Articulo V: art́ıculo publicado en la revista Frontiers in Genetics - Systems Biology [Davila-
Velderrain et al., 2015b].
Articulo VI: art́ıculo en prensa en la revista BMC Systems Biology.
Articulo VII: art́ıculo sometido a la revista Journal of The Royal Society Interface.
Articulo VIII: caṕıtulo en preparación para ser sometido a la revista Frontiers in Genetics -
Bioinformatics and Computational Biology [Davila-Velderrain et al., 2014a].
Articulo IX: art́ıculo publicado en la revistaMolecular Biology and Evolution [Davila-Velderrain
et al., 2014b].
Articulo X: art́ıculo publicado en la revista Molecular Plant [Pérez-Ruiz et al., 2015].
7
Chapter 2
Introdución al Marco
Teórico-Conceptual
...biology is finally ready for its own “theory branch”
— Arthur D Lander, The edges of understanding (2010)
8
267
D O S S I E R
 * Instituto de Ecología-Universidad Nacional Autónoma de México . E-mail: jdjosedavila@
gmail .com
** Centro de Ciencias de la Complejidad-Universidad Nacional Autónoma de México . 
 E-mail: eabuylla@gmail .com
Dávila-Velderrain, José and Elena Álvarez-Buylla. «Linear Causation Schemes in Post-genomic Biology: The Subliminal and 
Convenient One-to-one Genotype-Phenotype Mapping Assumption.» Interdisciplina 3, no 5 (2015): 267-280.
José Dávila-Velderrain* and Elena Álvarez-Buylla Roces**
Linear Causation Schemes in Post-genomic 
Biology: The Subliminal and Convenient 
One-to-one Genotype-Phenotype Mapping 
Assumption
Abstract | In this essay we question the validity of basic assumptions in molecular biology 
and evolution on the basis of recent experimental data and through the lenses of a systems 
and nonlinear perspective . We focus our discussion on two well-established foundations of 
biology: the flow of information in molecular biology (i .e ., the central dogma of molecular 
biology), and the “causal” linear signaling pathway paradigm . Under both paradigms the 
subliminal assumption of a one-to-one genotype-phenotype mapping (GPM) constitutes an 
underlying working hypothesis in many cases . We ask if this is empirically sustainable in 
post-genomic biology . We conclude that when embracing the notion of complex networks 
and dynamical processes governing cellular behavior — a view now empirically validat-
ed — one-to-one mapping can no longer be sustained . We hypothesize that such subliminal 
and sometimes explicit assumption may be upheld, to a certain degree, because it is conve-
nient for the private appropriation and marketing of scientific discoveries . Hopefully, our 
discussion will help smooth the undergoing transition towards a more integrative, explan-
atory, quantitative and multidisciplinary systems biology . The latter will likely also yield 
more preventive and sustainable medical and agricultural developments, respectively, than 
a reductionist approach .
Keywords | post-genomic biology – genotype-phenotype mapping – genetic determinism – 
flow of genetic information
Introduction
SCIENCE IS MOSTLY PRACTICED out of consensus . Scientific progress, however, is also 
sustained by the continual challenge to accepted ideas . Unstated agreements 
break from time to time, and then — some say — a transition, a so-called paradigm 
268
Volumen 3 | Número 5 | enero-abril 2015
D
O
S
S
I
E
R
INTERdisciplina
shift, occurs (Kuhn 2012 [1962]) . In the last decades, several authors have dis-
cussed the possibility of a paradigm shift in biology, given the apparent crisis 
of some of its foundational principles . (Wilkins 1996; Strohman 1997; O’Malley 
and Boucher 2005) . In this paper, we would instead like to substantiate that a 
large portion of mainstream biological research subliminally embraces particu-
lar assumptions that are empirically unsustainable in this post-genomic era . 
Some of these assumptions are so deeply rooted that they still permeate the de-
sign, interpretation and description of a 
wide range of biological research at the 
molecular level, although, if explicitly 
confronted, anyone would dismiss them . 
Routinely we look for single, “causal” mu-
tations responsible for complex pheno-
types and assume that by finding the mo-
lecular basis of a mutation that is 
correlated to a particular condition, the 
emergence of the latter is explained . Im-
portantly, such rationale implies that in 
most cases a one-to-one relationship will 
be possible . By extending such assump-
tions we define signaling pathways as au-
tonomous entities instructing the cell 
how to behave under a particular condi-
tion . If pathological behavior arises, we 
look for the source of incorrect instruc-
tions: the mutated component or path-
way . We automatically interpret any man-
ifestation of a learned feature, such as 
drug resistance, as the consequence of 
the optimization principles of (Darwin-
ian) adaptation by means of “random” 
mutation and selection . Is this recurrent 
bias towards ad hoc explanations based 
solely on plausibility given the evidence, 
or is it the mere consequence of a naively 
inherited tradition? We consider that an explicit presentation of some of the as-
sumptions in light of post-genomic empirical data, and through the lenses of a 
systems, nonlinear perspective to biology, will clarify this question . This may 
prove useful for current biology students and scientists interested in multidis-
ciplinary research .
We also include in the term 
post-genomic several features 
that characterize modern 
biology: (1) abundance of 
experimental molecular data, 
(2) access to systematic ways 
of characterizing cellular 
phenotypic states, and (3) a 
tendency to produce 
quantitative data and to 
formulate mathematical/
computational models. 
Consequently, in our view, 
post-genomic biology is 
necessarily multidisciplinary, 
integrative, formal, and 
quantitative
269
José Dávila-Velderrain and Elena Álvarez-Buylla Linear Causation Schemes in...
D
O
S
S
I
E
R
A first necessary detour: What do we mean by post-genomic biology? The 
availability of complete genome sequences (and also transcriptomes, pro-
teomes, metabolomes, etc) obviously impacted biological research, enabling 
new levels of interrogation –as well as unmasking new sources of empirical sup-
port (rejection) for otherwise assumed facts . Here, however, besides access to 
genome-wide data, we also include in the term post-genomic several features 
that characterize modern biology: (1) abundance of experimental molecular 
data, (2) access to systematic ways of characterizing cellular phenotypic states, 
and (3) a tendency to produce quantitative data and to formulate mathematical/
computational models . Consequently, in our view, post-genomic biology is nec-
essarily multidisciplinary, integrative, formal, and quantitative .
The Most Basic, Naive Assumption: The One-to-One GPM
Nowadays, it is common to think about the relationship between genotypes and 
phenotypes in terms of some kind of complex mapping (Kauffman 1993; Mendo-
za and Álvarez-Buylla 1998; Wagner and Zhang 2011; Davila-Velderrain and Ál-
varez-Buylla 2014; Ho and Zhang 2014) . The concept of a “genotype-phenotype 
map” can be traced back to Alberch, who elegantly proposed a model based on 
the principles of systems dynamics to express the inadequacy of what some call 
(molecular) genetic determinism, i .e ., the assumption that genes directly deter-
mine phenotypes (Alberch 1991) . Equally limited would be to assume an epigen-
etic determinism . Importantly, such A gene-centered assumption is the concep-
tual basis of the often invoked metaphors of a ‘genetic blueprint’ or a ‘genetic 
program’ (Pigliucci 2010) . Furthermore, it also implies a linear relationship bet-
ween genotypes and phenotypes; in other words, a one-to-one mapping . This 
simplistic model is attractive, since it naturally embraces a cause-and-effect in-
terpretation, which makes it intuitively appealing . But if we think about this as-
sumption of one genotype specifically producing a particular phenotype, we 
have to address how such a simplistic view can fit any observation . Nonethe-
less, this one-to-one model is still at the basis of most mainstream programs of 
biomedical or biotechnological developments (e .g ., transgenic crops) .
A second necessary detour: what genotype and phenotype? In the epistemo-
logy of evolution and biology, in general, it is common to talk about genotype 
and phenotype as absolute terms . But these can be defined at different le-
vels, and in practice genotype and phenotype distinctions are just partial and 
dynamical (Lewontin 2011) . In post-genomic biology this distinction is com-
monly aided by the use of simple GPM models (see, for example Soyer 2012) . 
Consequently, there is not only one type of genotype and phenotype . A GPM 
model can be specified in different ways . For the sake of this essay we establish 
270
Volumen 3 | Número 5 | enero-abril 2015
D
O
S
S
I
E
R
INTERdisciplina
that the genotype will be represented by a gene regulatory network (GRN) 
and the phenotypes by a gene expression profile or configuration (see below) . 
Nevertheless, it is noteworthy that in the current era of next-generation sequen-
cing (NGS) and single-cell biology, the empirical characterization of the comple-
te genotypes of multiple individual cells is becoming feasible . Unfortunately, 
for both conceptual and technical reasons, the same cannot be said for pheno-
types — although specific systematic phenotyping strategies are under develop-
ment (see, for example Houle et al . 2010; Hancock 2014) . 
One-to-One Genotype-Phenotype Mapping and the Central Dogma
Crick declared “the central dogma of molecular biology” first in 1958 and then 
it was reiterated once again in 1970 (Crick 1958, 1970) . In simple terms, the 
dogma posits that information flows within cells from DNA to RNA to proteins; 
and, as a result, the cellular phenotype is determined (Shapiro 2009) . The sim-
plifications involved in the model have been already questioned from an infor-
mation viewpoint, concluding that discoveries in the last decades have made 
the dogma untenable (Shapiro 2009) . Here we focus instead on the cemented 
role of the dogma regarding the implicitly assumed linear and unidirectional 
scheme of causation of molecular phenotypes . According to an explicit interpre-
tation of the dogma one gene encodes for one protein, which somehow determi-
nes one observable trait (i .e . phenotype) . This simplistic view can be framed 
effectively into a one-to-one GPM model (see Figure 1a) . How do we define a phe-
notype? Here a phenotype is assigned to a molecule, a protein, because it is said 
to have a function . This function should be then an observable characteristic of 
the cell (organism) . Therefore, the first one-to-one GPM to discuss would be: a 
gene (i .e ., the genotype) codes for a protein, which performs a specific function 
that determines an observable characteristic (i .e ., the phenotype) .
Is this One-to-One (Gene-to-Function) Model Empirically Sustainable 
in Post-Genomic Biology?
A first difficulty that we can think of is conceptual in nature . What do we mean 
by function? Defining a function in biology is not trivial (Huang 2000; Huneman 
2013; Brunet and Doolittle 2014, Doolittle et al . 2014) . First of all, the function 
assignment can be given to entities at multiple levels of molecular organization; 
such as gene, protein, protein domain, protein complex, or pathway (Huang 
2000) . In the last years, researchers in the areas of genomics and epigenomics 
are even advocating the mapping of function at genome-wide level and single-
nucleotide resolution (Kellis et al . 2013) . For the sake of concreteness, let us 
271
José Dávila-Velderrain and Elena Álvarez-Buylla Linear Causation Schemes in...
D
O
S
S
I
E
R
just focus on function at the protein level . Although what we define as protein 
function is most of the times conditional on the context –i .e ., cellular environ-
ment– (Huang 2000), for the purpose of our discussion, let us also assume that 
a protein function can be invariably assigned . Thus, in the simple one-to-one 
model, one gene is invariably linked to a specific function through the action of 
a protein .
a)
RNA ProteinDNA
b)
(Cause)
Instructive signal
(Effect)
Cellular phenotype
Genotype Phenotype
Genotypes Phenotypes
G1 G2 G3
P1 P2
P3
...G1 G2 G3
P1
G1 G2 G3
P3
...
G1 G2 G3
P2
...
c)
G = Gene
P = Phenotype
Figure 1. Schematic representation of the GPM exposed in the main text. a) One-to-one GPM model 
representing the central dogma of molecular biology: a gene (i.e., the genotype) codes for a protein, 
which performs a specific function that determines an observable characteristic (i.e., the phenoty-
pe). b)  One-to-one GPM model representing the causal linear signaling pathway paradigm: genes 
code the proteins involved in the pathway (genotype), and these map one specific molecular signal 
(instruction) to a one specific cellular phenotype. c) A non-linear GPM representing cell phenotype 
specification by GRN dynamics: genes in a single genome (genotype) interact in complex GRNs who-
se regulatory interactions ultimately determine observable cell phenotypes.
272
Volumen 3 | Número 5 | enero-abril 2015
D
O
S
S
I
E
R
INTERdisciplina
According to the most recent assembly version of the human genome in En-
sembl database (http://www .ensembl .org/), humans have 20,389 coding genes, 
9,656 small noncoding genes and 14,470 long non-coding genes . A first obvious 
observation is that not all genes code for proteins . Two post-genomic facts: (1) 
most of the human genome is non-protein-coding (Alexander et al . 2010), and 
(2) transcription occurs much more often than anticipated (Carninci et al . 2005; 
Cheng et al . 2005) . Do the genes that do not encode proteins also define a phe-
notype? Well, probably, in some way; but surely not by means of a one-to-one 
GPM, given the emerging view that non-coding transcription is tightly linked to 
gene regulation and cell-type specifica-
tion (Natoli and Andrau 2012) . For exam-
ple, it was recently shown that RNA tran-
scribed from enhancers, the so-called 
eRNA, is able to regulate transcription 
(Plosky 2014) . As we will see below, gene 
regulation in itself is the core mechanism 
behind the definition of gene regulatory 
networks; it is also fundamental for un-
derstanding network collective behavior . 
Conceptualizing cell behavior in terms of 
molecular networks, in turn, represents a 
complete deviation from a one-to-one 
GPM (see below) .
Besides (non)coding genes, the num-
ber of proteins coded in the human geno-
me and represented by transcript modifi-
cations has been estimated to be between 
50,000 and 500,000 (Uhlen and Ponten 
2005) . Considering the now known num-
ber of both genes and (estimated) prote-
ins in other organisms, several authors 
have pointed out that genomic (and pro-
teomic) complexity are not correlated 
with phenotypic complexity (see, for 
example Huang 2002) . This empirical fact again is not consistent with what we 
would expect by extension of the dogma .
Beyond curiosity awakened by newly generated genomic data, a more se-
rious drawback of the one-to-one GPM associated with the central dogma is that 
it completely ignores gene interactions (Tyler et al . 2009) . Epistasis refers to the 
phenomenon in which the functional effect of one gene is conditional on other 
Recent assembly version of the 
human genome in Ensembl 
database, humans have 20,389 
coding genes, 9,656 small 
noncoding genes and 14,470 
long non-coding genes. A first 
obvious observation is that not 
all genes code for proteins. Two 
post-genomic facts: (1) most of 
the human genome is non-
protein-coding and (2) 
transcription occurs much 
more often than anticipated. Do 
the genes that do not encode 
proteins also define a 
phenotype?
273
José Dávila-Velderrain and Elena Álvarez-Buylla Linear Causation Schemes in...
D
O
S
S
I
E
R
genes (Phillips 2008), whereas Pleiotropy refers to one function being affected 
by multiple genes (Stearns 2010); these two phenomena are well-established 
facts (and concepts) in classical and modern genetics (Lehner 2011; Wagner and 
Zhang 2011) . Nowadays such genetic interactions are being studied systemati-
cally at a genomic scale . For example, it is now possible to test millions of diffe-
rent combinations of double mutants and to evaluate their effects on a quanti-
fiable function, as Costanzo and colleagues did using the budding yeast, 
Saccharomyces cerevisiae (Costanzo et al . 2010) . Studies such as this one have 
clearly shown that the effect of one gene on a specific phenotype depends on 
the activity (or lack thereof) of many other genes . In this sense, a genetic inte-
raction is defined on the base of this conditional functional effect . Although a 
careful discussion of epistasis and pleiotropy is beyond the scope of this paper, 
it is noteworthy that such mechanisms are closely related with two undeniable 
types of experimental evidence: (1) very different results can be produced from 
a nearly identical set of genes or the same genotype can produce contrasting 
phenotypes, and (2) virtually identical phenotypic end points can be reached by 
using extremely different genotypes . Evidently, these facts do not fit a one-to-
one GPM . Although seemingly paradoxical, both statements can be perfectly re-
conciled by considering a many-to-many GPM model in which interactions 
among genetic and non-genetic components are explicitly considered; a view 
much more consistent with how living, adaptable systems behave and evolve .
One-to-One Mapping and Signaling Pathways
Extending the one-to-one view to a higher level, molecular biologists apply it to 
associating an altered signaling pathway to a particular phenotypic condition . 
Extracellular signals are transmitted by intermediary to effector proteins; which 
eventually activate the sets of genes responsible for the establishment of “ap-
propriate” phenotypes . Note that the term pathway by itself makes reference to 
a group of events that occur orderly along a single line. Thus, in a sense, this 
multi-molecular model continues the dogmatic idea of linear, unidirectional in-
formation transfer . Thereby, in our view, it also effectively constitutes a one-to-
one GPM (see figure 1b) . Genes encode the proteins involved in the pathway 
(genotype), and these map unto one specific molecular signal (instruction) to 
one specific cellular phenotype . The linear property of signaling pathways also 
implies unidirectional cause-and-effect: a given instructional signal is thought 
to directly cause a phenotypic manifestation . Biologists have traditionally taken 
this simple pathway picture as a valid explanation at the molecular level for 
many cellular phenotypes . Not even a one-to-one approach to associate a net-
work with a phenotype is valid (see below) .
274
Volumen 3 | Número 5 | enero-abril 2015
D
O
S
S
I
E
R
INTERdisciplina
Is this One-To-One (Signal-to-Phenotype) Model Empirically 
Sustainable in Post-Genomic Biology?
Similar questions as the ones raised above can be posed here . For instance, are 
there enough signaling pathways for the number of possible extracelullar cues? 
Is there a direct, one-to-one, relationship among signals and phenotypes? If so, 
why do cellular phenotypes (i .e . cell types) seem to be discrete while, for exam-
ple, signals carried by soluble growth factors display concentrations subject to 
continuous variation? And, more importantly, how and why are cellular pheno-
types maintained after the signal has ceased? As we will explain below, rethink-
ing cell behavior as the result of constraints imposed by regulatory interactions 
of complex molecular networks is useful to address these questions .
The genomic explosion has led to the brute-force characterization of mole-
cular components and their interactions, which are now being integrated in lar-
ge databases (Chatr-aryamontri et al . 2013) . As expected, efforts have also tried 
to classify such components in genome-wide collections of signaling pathways 
in multiple organisms (Schaefer et al . 2009, Croft et al . 2010) . What has been 
learned? Does the exhaustive characterization of pathways enable understan-
ding of cellular phenotypes and their plasticity? In analogy to the failure of the 
pre-genomic prediction that by characterizing all the genes of an organism one 
will understand the genome-encoded rules instructing its behavior; listing mo-
lecular components and their interactions in pathways has only uncovered a 
picture that is much more complex than anticipated . But phenotypic manifesta-
tions are far from being explained by means of linear chains of molecular cau-
sation (Huang 2011) — or, in other words, of linear associations rather than 
 explanatory models .
Decades of experimentation have shown that there is extensive crosstalk 
between the individually characterized signaling pathways . Accordingly, the 
phenomena of epistasis and pleiotropy explained above are naturally extended 
at the pathway level . While several different pathways can converge to specific 
phenotypes, one specific pathway and molecular signal can also produce diffe-
rent phenotypes depending on the context (Huang 2000) . These observations 
suggest cross interactions beyond linear cascades . On the other hand, an effect 
similar to the one “caused” by a specific molecular signal can be produced by 
nonspecific stimuli or even in a stimulus-independent manner . For example, 
mechanical stimuli such as those induced by cell shape alterations can induce 
specific cell phenotypes without any molecular elicitor or genetic change (Huang 
2000) . On the other hand, given the intrinsic stochasticity of both extra- and 
 intra-cellular biochemical reactions, cells in a lineage-specific manner can assu-
me different and heritable phenotypes either in the absence of an associated 
genetic or environmental difference or by processing stochastic, nonspecific 
275
José Dávila-Velderrain and Elena Álvarez-Buylla Linear Causation Schemes in...
D
O
S
S
I
E
R
 environmental cues (Perkins and Swain 2009; Balázsi et al . 2011) . These facts 
render a mechanistic explanation by means of the one-to-one GPM at the pathway 
level untenable, as well . The inevitable plasticity of cell behavior and the ro-
bustness of observed phenotypic manifestations call for an alternative explana-
tory model . We argue below that the formal perspective of cell behavior as an 
emergent property of the constraints imposed by gene regulatory networks pro-
vides an alternative view to how genotypes map unto phenotypes, providing a 
starting point for addressing otherwise highly complex processes .
Beyond the One-to-One GPM: A Network Dynamics Perspective 
How do the two views (gene and signaling pathway to function one-to-one map-
ping) above stand in post-genomic, systems biology? Genes, encoded proteins, 
and linear signaling pathways are actually embedded in complex networks of 
genetic and non-genetic components which generally have various positive and 
negative feedback loops and dynamical behavior . We focus here on gene regula-
tion, which is the basis for conceptualizing gene interactions, the fundamental 
property underlying nonlinear, gene regulatory networks . The concept of gene 
regulation itself, which is nothing new, is not consistent with a one-to-one GPM, 
because it implies that the phenotypic effect of one gene function will depend 
on the activity of other genes regulating it . Although explicit awareness of the 
fact that the genes coding for all the proteins in the cell are necessarily regula-
ted by some other regulatory proteins, which are themselves also regulated, 
seems overwhelming; such realization can be succinctly represented in qualita-
tive gene regulatory network (GRN) models . These are becoming very useful to 
follow and understand the concerted action of multiple interacting components .
A common working model in systems biology is that in which the genome is 
mapped directly to a GRN, and the cellular phenotype is represented by the ac-
tivity of each of its genes, its expression pattern . Thus in a genotype-phenotype 
distinction based on GRN dynamics, a network represents effectively the geno-
type of the cell, while its associated expression profile represents its phenotype 
(Davila-Velderrain and Álvarez-Buylla 2014) . The structure of the genome (and 
network) remains virtually constant through development while the cellular 
phenotype changes . Why are phenotypic changes observed through develop-
ment in such robust and reproducible patterns?
The genomic nature of the GRN implies a physically coded structure, by 
means of which the network naturally constrains the permissible temporal be-
havior of the activity of each gene . For example, a specific gene a is regulated 
by a specific set of genes . Given the activity state of these regulators and the 
functional form of the regulation, each time gene a will be channelled to take 
276
Volumen 3 | Número 5 | enero-abril 2015
D
O
S
S
I
E
R
INTERdisciplina
specific future states . This simple regulatory rule applies simultaneously to all 
the genes producing a self-organizing process that would inevitably lead to the 
establishment of only those cellular states (phenotypes), which are logically 
consistent with the underlying regulatory logic . Hence, the GRN imposes con-
straints on the behavior of the cell . The observed robustness and reproducibil-
ity of cell behavior emerges naturally as a self-organizing process . Any source 
of extracellular (non) specific inductive stimulus would inevitably converge to 
one of the phenotypic states which are logically consistent with the underlying 
regulatory logic of the network being considered .
The rationale briefly exposed above has been exploited to propose GRNs 
grounded on experimental data for understanding how cell-fate specification 
occurs during, for example, early flower development (See Mendoza and Álva-
rez-Buylla 1998; Espinosa-Soto et al . 2004; and an update in Sanchez-Corrales 
et al . 2010), and root stem cell patterning (Azpeitia et al . 2010); and it is now 
supported by a wealth of consolidated theoretical and experimental work (see, 
for example Huang et al . 2005; Azpeitia et al . 2014) .
Importantly, in contrast to the assumptions implicit in the one-to-one GPM, 
interactions in the network are fundamental to the establishment of the pheno-
type, and thus the effect of a mutation on the manifested phenotype will be 
 conditional on the network context of the gene under consideration (Davila-Vel-
derrain et al . 2014) . Given that the multitude of observed robust cellular pheno-
typic states would depend on network constraints due to gene regulatory inte-
ractions, the orchestrating role of GRNs effectively constitutes a many-to-many 
(non-linear) GPM, in which most components can, at the same time, constitute 
both causes and effects (Figure 1c) .
Blind, Indifferent or market-oriented Biomedical and 
Biotechnological Research?
Notwithstanding all the evidence produced by almost two decades of post-geno-
mic research, the subliminal presence of the over-simplified one-to-one GPM, 
although most of the time it is not credited, is undeniable . It is implicitly assu-
med as a main goal driving mainstream biomedical research that genes cause, 
for example, cancer; for they cause phenotypes by coding proteins (Huang 
2013) . This is also the case in biotechnological research, where it is acknowled-
ged that a particular gene from one species in which a particular “function” is 
produced, can be readily put into another species expecting the same “function” 
(Vaeck et al . 1987) . Considering that a myriad of studies search for “causal” mu-
tations, apparently this gene-centric assumption is rarely noticed — or, alterna-
tively, it is just ignored . Despite the huge amount of resources invested in 
277
José Dávila-Velderrain and Elena Álvarez-Buylla Linear Causation Schemes in...
D
O
S
S
I
E
R
 genome sequencing projects, such thing as a universal (causal) mutation for a 
degenerative disease has not been successfully identified (Huang 2013) . Never-
theless, having specific molecules as candidate causal factors of particular di-
seases enables companies to develop new drugs for the market . Given the limi-
ted nature of the underlying simplistic one-to-one GPM, this approach is likely 
to fail . It may reproduce only based on its limited effectiveness — and mostly on 
marketing strategies — instead of deep explanations or much needed solutions . 
Importantly, such continuing search for potential molecular targets in therapeu-
tics or single-gene golden bullet solutions to complex agricultural threats evi-
dences the prevalence of the one-to-one GPM, i .e ., by assuming that there is a 
protein for every disease or for any environmental challenge in agriculture . 
The potential for therapy also complicates matters, for it may be a perfectly 
acceptable research goal regardless of its impact on improving understanding 
or on actually proving causation . Thus, it could be the case that biomedical re-
search itself has not naturally evolved to such a naive state; it might be instead 
that the market driven technocentric character of modern “science” happens to 
stimulate the inheritance of old ideas that continue to be convenient — unfortu-
nately for science, though, the rate of increase in conceptual understanding 
seems not to be following the fast-paced technological evolution .
To summarize, the prevailing paradigm implicitly assumes that genes deter-
mine cell behavior through a one-to-one GPM . Specifically, genes code proteins 
which directly determine phenotypes, and consequently, mutations in the genes 
should by themselves alter phenotypes . Therefore, targeting altered proteins 
produced from mutated genes seems to be the best strategy to “correct” a patho-
logical phenotype — the same can be said of epigenetic alterations, altered 
pathways or even networks . However, a multitude of post-genomic evidence 
makes the one-to-one GPM untenable . In contrast, a GPM in terms of the orches-
trating role of molecular regulatory networks, which constitutes a many-to-many 
GPM, naturally explains paradoxical observations and provides a formal fra-
mework for the interpretation of ever-growing post-genomic molecular data . 
Acknowledgements
This work was supported with ERAB grants: Conacyt (Mexico) 180098 and 
180380; and UNAM-DGAPA-PAPIIT: IN203113 .
References
Alberch, P . «From genes to phenotype: dynamical systems and evolvability .» 
 Genetica 84, nº 1 (1991): 5-11 .
278
Volumen 3 | Número 5 | enero-abril 2015
D
O
S
S
I
E
R
INTERdisciplina
Alexander, R . P ., G . Fang, J . Rozowsky, M . Snyder and M . B . Gerstein . «Annotating 
non-coding regions of the genome .» Nature Reviews Genetics 11, nº 8 (2010): 
559-571 .
Azpeitia, E ., J . Davila-Velderrain, C . Villarreal and E . Álvarez-Buylla . «Gene regu-
latory network models for floral organ determination .» Flower Development 
(2014): 441-469 .
———, M . Benítez, I . Vega, C . Villarreal and E . Álvarez-Buylla . «Single-cell and 
coupled GRN models of cell patterning in the Arabidopsis thaliana root stem 
cell niche .» BMC systems biology 4, nº 1 (2010): 134 .
Balázsi, G ., Van Oudenaarden, A . and J . J . Collins . «Cellular decision making and 
biological noise: from microbes to mammals .» Cell 144, nº 6 (2011): 910-
925 .
Brunet, T . D . and W . F . Doolittle . «Getting “function” right .» Proceedings of the 
National Academy of Sciences 111, nº 33 (2014): E3365-E3365 .
Carninci, P ., et al . «The transcriptional landscape of the mammalian genome .» 
Science 309, nº 5740 (2005): 1559-1563 .
Chatr-aryamontri, A ., et al . «The BioGRID interaction database: 2013 update .» 
Nucleic acids research 41 nº D1 (2013): D816-D823 .
Cheng, J ., et al . «Transcriptional maps of 10 human chromosomes at 5-nucleo-
tide resolution .» Science 308, nº 5725 (2005): 1149-1154 .
Costanzo, M ., et al . «The genetic landscape of a cell .» Science 327, nº 5964 
(2010): 425-431 .
Crick, F . H . «Central dogma of molecular biology .» Nature 227, nº 5258 (1970): 
561-563 .
——— . «On protein synthesis .» Symposia of the Society for Experimental Biology 
12 (1958): 138 .
Croft, D ., et al . «Reactome: a database of reactions, pathways and biological pro-
cesses .» Nucleic acids research, (2010): gkq1018 .
Davila-Velderrain, J ., A . Servin-Marquez and E . Álvarez-Buylla . «Molecular evolu-
tion constraints in the floral organ specification gene regulatory network 
module across 18 angiosperm genomes .» Molecular biology and evolution 
31, nº 3 (2014): 560-573 .
——— and E . Álvarez-Buylla . «Bridging genotype and phenotype .» In Frontiers in 
Ecology, Evolution and Complexity, edited by Octavio Miramontes, Alfonso 
Valiente-Banuet and Mariana Benítez . CopIt ArXives, 2014 .
Doolittle, W . F ., T . D . Brunet, S . Linquist and T . R . Gregory . «Distinguishing be-
tween “function” and “effect” in genome biology .» Genome biology and evo-
lution 6, nº 5 (2014): 1234-1237 .
Espinosa-Soto, C ., P . Padilla-Longoria and E . Álvarez-Buylla . «A gene regulatory 
network model for cell-fate determination during Arabidopsis thaliana 
279
José Dávila-Velderrain and Elena Álvarez-Buylla Linear Causation Schemes in...
D
O
S
S
I
E
R
 flower development that is robust and recovers experimental gene expres-
sion profiles .» The Plant Cell Online 16, nº 1 (2004): 2923-2939 .
Hancock, J . M . (Ed .) . Phenomics. CRC Press, 2014 .
Ho, W . C . and J . Zhang . «The Genotype-Phenotype Map of Yeast Complex Traits: 
Basic Parameters and the Role of Natural Selection .» Molecular biology and 
evolution 31, nº 6 (2014): 1568-1580 .
Houle, D ., D . R . Govindaraju and S . Omholt . «Phenomics: the next challenge .» 
Nature Reviews Genetics 11, nº 12 (2010): 855-866 .
Huang, S ., G . Eichler, Y . Bar-Yam and D . E . Ingber . «Cell fate sas high-dimension-
al attractor states of a complex gene regulatory network .» Physical Review 
Letters 94, nº 12 (2005): 128701 .
——— . «Genetic and non-genetic instability in tumor progression: link between 
the fitness landscape and the epigenetic landscape of cancer cells .» Cancer 
and Metastasis Reviews 32, nº 3-4 (2013): 423-448 .
——— . «Rational drug discovery: what can we learn from regulatory networks?» 
Drug discovery today 7, nº 20 (2002): s163-s169 .
——— . «Systems biology of stem cells: three useful perspectives to help over-
come the paradigm of linear pathways . Philosophical Transactions of the 
Royal Society B .» Biological Sciences 366, nº 1575 (2011): 2247-2259 .
——— . «The practical problems of post-genomic biology .» Nature biotechnology 
18, nº 5 (2000): 471-472 .
Huneman, P . Functions: selection and mechanisms. Springer, 2013 .
Kellis, M ., et al . «Defining functional DNA elements in the human genome .» Pro-
ceedings of the National Academy of Sciences 111, nº 17 (2014): 6131-6138 .
Kuhn, T . S . The structure of scientific revolutions. University of Chicago Press, 
2012 [1962] .
Lehner, B . «Molecular mechanisms of epistasis within and between genes .» 
Trends in Genetics 27, nº 8 (2011): 323-331 .
Lewontin, R . «The genotype/phenotype distinction .» In Stanford Encyclopedia 
of Philosophy . 2011 .
Mendoza, L . and E . Álvarez-Buylla . «Dynamics of the genetic regulatory network 
for arabidopsis thaliana flower morphogenesis .» Journal of Theoretical Biol-
ogy 193, nº 2 (1998): 307-319 .
Natoli, G . and J . C . Andrau . «Noncoding transcription at enhancers: general prin-
ciples and functional models .» Annual review of genetics 46 (2012): 1-19 .
O’Malley, M . A . and Y . Boucher . «Paradigm change in evolutionary microbiology .» 
Studies in History and Philosophy of Science Part C: Studies in History and 
Philosophy of Biological and Biomedical Sciences, 2005: 183-208 .
Perkins, T . J . and P . S . Swain . «Strategies for cellular decision‐making .» Molecular 
systems biology 5, nº 1 (2009) .
280
Volumen 3 | Número 5 | enero-abril 2015
D
O
S
S
I
E
R
INTERdisciplina
Phillips, P . C . «Epistasis-the essential role of gene interactions in the structure 
and evolution of genetic systems .» Nature Reviews Genetics 9, nº 11 (2008): 
855-867 .
Pigliucci, M . «Genotype-phenotype mapping and the end of the ‘genes as blue-
print’metaphor .» Philosophical Transactions of the Royal Society B: Biological 
Sciences 365, nº 1540 (2010): 557-566 .
Plosky, Brian S . eRNAs Lure NELF from Paused Polymerases.Molecular Cell. 2014 .
Rose, M . R . and T . H . Oakley . «The new biology: beyond the Modern Synthesis .» 
Biology direct 2, nº 1 (2007): 30 .
Sanchez-Corrales, Y . E ., E . Álvarez-Buylla and L . Mendoza . «The Arabidopsis 
thaliana flower organ specification gene regulatory network determines a 
robust dif-ferentiation process .» Journal of Theoretical Biology 264, nº 3 
(2010): 971-983 .
Schaefer, C . F ., et al . «PID: the pathway interaction database .» Nucleic acids re-
search, 2009: D674-D679 .
Shapiro, J . A . «Revisiting the central dogma in the 21st century .» Annals of the 
New York Academy of Sciences 1178, nº 1 (2009): 6-28 .
Soyer, O . S . (Ed .) . Evolutionary systems biology 751 Spring 2012 .
Stearns, F . W . «One hundred years of pleiotropy: a retrospective .» Genetics 186, 
nº 3 (2010): 767-773 .
Strohman, R . C . «The coming Kuhnian revolution in biology .» Nature biotechnol-
ogy 15, nº 3 (1997): 194-200 .
Stuart A ., Kauffman . The origins of order: Self-organization and selection in evo-
lution. Oxford, UK: Oxford University Press, 1993 .
Tyler, A . L ., F . W . Asselbergs, S . M . Williams and J . H . Moore . «Shadows of com-
plexity: what biological networks reveal about epistasis and pleiotropy .» 
Bioessays 31, nº 2 (2009): 220-227 .
Uhlen, M . and F . Ponten . «Antibody-based proteomics for human tissue profil-
ing .» Molecular & Cellular Proteomics 4, nº 4 (2005): 384-393 .
Vaeck, M ., et al .«Transgenic plants protected from insect attack .» Nature 328 
(1987): 33-37 .
Wagner, G . P . and J . Zhang . «The pleiotropic structure of the genotype-pheno-
type map: the evolvability of complex organisms .» Nature Reviews Genetics 
12, nº 3 (2011): 204-213 .
Wilkins, A . S . «Are there ‘Kuhnian’revolutions in biology?» BioEssays (1996): 695-
696 .
1
Bridging the Genotype and the Phenotype:
Towards An Epigenetic Landscape Approach to
Evolutionary Systems Biology
Davila-Velderrain J1,2,∗, Alvarez-Buylla ER1,2,∗
1 Instituto de Ecoloǵıa, Universidad Nacional Autónoma de México, Cd. Universitaria,
México, D.F. 04510, México
2 Centro de Ciencias de la Complejidad (C3), Universidad Nacional Autónoma de
México, Cd. Universitaria, México, D.F. 04510, México
∗ E-mail: jdjosedavila@gmail.com, eabuylla@gmail.com
Abstract
Understanding the mapping of genotypes into phenotypes is a central challenge of current
biological research. Such mapping, conceptually represents a developmental mechanism
through which phenotypic variation can be generated. Given the nongenetic character
of developmental dynamics, phenotypic variation to a great extent has been neglected
in the study of evolution. What is the relevance of considering this generative process
in the study of evolution? How can we study its evolutionary consequences? Despite an
historical systematic bias towards linear causation schemes in biology; in the post-genomic
era, a systems-view to biology based on nonlinear (network) thinking is increasingly being
adopted. Within this view, evolutionary dynamics can be studied using simple dynamical
models of gene regulatory networks (GRNs). Through the study of GRN dynamics,
genotypes and phenotypes can be unambiguously defined. The orchestrating role of GRNs
constitutes an operational non-linear genotype-phenotype map. Further extension of these
GRN models in order to explore and characterize an associated Epigenetic Landscape
enables the study of the evolutionary consequences of both genetic and non-genetic sources
of phenotypic variation within the same coherent theoretical framework. The merging of
conceptually clear theories, computational/mathematical tools, and molecular/genomic
data into coherent frameworks could be the basis for a transformation of biological research
from mainly a descriptive exercise into a truly mechanistic, explanatory endeavor.
Introduction
The mechanistic understanding of the mapping of genotypes into phenotypes is at the
core of modern biological research. During the lifetime of an individual, a developmental
process unfolds, and the observed phenotypic characteristics are consequently established.
As an example, a given individual may or may not develop a disease. Can we explain
the observed outcome exclusively in terms of genetic differences and an unidirectional,
 on April 18, 2014http://biorxiv.org/Downloaded from 
2
linear relationship between genotype and phenotype? Researchers in biology have mostly
assumed so. Over the last decades, scientists under the guidance of such genetic-causal
assumption have struggled with inconsistent, empirical observations. The biological rele-
vance of the phenotypic variability produced during the developmental process itself, and
not as the consequence of genetic mutations, has only recently started to be acknowl-
edged [1–5].
Understanding the unfolding of the individuals phenotype is the ultimate goal of devel-
opmental biology. Evolutionary biology, on the other hand, is largely concerned with the
heritable phenotypic variation within populations and its change during long time periods,
as well as the eventual emergence of new species. Historically, population-level models
seek to characterize the distribution of genotypic variants over a population, considering
that genetic change is a direct indicator of phenotypic variation. Certain assumptions are
implicit to such reasoning. Are those assumptions justifiable in light of the now avail-
able molecular data and the recently uncovered molecular regulatory mechanisms? What
is the relevance of considering the generative developmental sources of phenotypic vari-
ation in the study of evolution? The aim of this paper is to highlight how a systems
view to biology is starting to give insights into these fundamental questions. The overall
conclusion is clear: an unilateral gene-centric approach is not enough. Evolution and de-
velopment should be integrated through experimentally supported mechanistic dynamical
models [6–13].
In the sections that follow, we first present a brief historical overview of evolutionary
biology and the roots of a systematic bias towards linear causation schemes in biology.
Then, we discuss the assumptions implicit in the so-called neo-Darwinian Synthesis of
Evolutionary Biology – the conventional view of evolution. In the last section, we briefly
describe an emerging research program which aims to go beyond the conventional the-
ory of evolution, focusing on a nonlinear mapping from genotype to phenotype through
the restrictions imposed by the interactions in gene regulatory networks (GRNs) and its
associated epigenetic landscape (EL). Overall, this contribution attempts to outline how
the orchestrating role of GRNs during developmental dynamics imposes restrictions and
enables generative properties that shape phenotypic variation.
Darwin’s Legacy
Darwin eliminated the need for supernatural explanations for the origin and adaptations
of organisms when he put evolution firmly on natural grounds [14]. In the mid-19th
century, Darwin published his theory of natural selection [15]. He proposed a natural
process, the gradual accumulation of variations sorted out by natural selection, as an
explanation for the shaping and diversity of organisms. This insight was what put the
study of evolution within the realms of science in the first place [14]. Although it has
had its ups and downs [16], the Darwinian research tradition predominates in modern
evolutionary biology. Much of its success is due to a new (gene-centric) interpretation,
the so-called neo-Darwinian modern synthesis [17]: the merging of mendelian genetics and
Darwin’s theory of natural selection due to prominent early 20th century statisticians. In
this framework, development was left outside, and evolution is seen as a change in the
genotypic constitution of a population over time. Genes map directly into phenotypes (see
 on April 18, 2014http://biorxiv.org/Downloaded from 
3
Figure 1a), implicitly assuming that genetic mutation is the prime cause of phenotypic
variation. Observed traits are generally assumed to be the result of adaptation, the
process whereby differential fitness (the product of the probability of reproduction and
survival) due to genetic variation in a particular environment, leads to individuals better
able to live in such an environment.
From Natural Selection to Natural Variation
Natural selection - a force emanating from outside the organism itself - is the conceptual
core of the Darwinian research tradition. Conceptually, the general process is as follows.
Random mutations occur during reproduction; these mutations are responsible for gen-
erating different (genetic) types of individuals. The selection process then results from
the fact that each type has certain survival probability and/or is able to achieve certain
reproductive performance given the environment. Through this differential rate, some
types are maintained while others are dismissed. It is said that, in this way, selection
makes a “choice” [18]. From a wider perspective, it is generally accepted that selection
is a generic process not restricted to biological evolution [19]. Any error-prone commu-
nication process in which information is consequently transmitted at different rates leads
itself to a selection mechanism. However, despite the appealing conceptual clarity of the
selection mechanism, it is not generally appreciated that the complexity inherent to bi-
ological systems hinders the mechanistic understanding of biological evolution. Because
the reproductive performance of a given type of variant is, mainly, a function of its phe-
notype; the paradigmatic selection process described above is plausible when one assumes
a straightforward causation of phenotype by genotype [10]. A more faithful model of
biological evolution should explicitly consider a genotype-phenotype (GP) map [20,21], a
developmental mechanism which specifies how phenotypic variation is generated (Figure
1b). The generated variation is then what triggers selection [22]. Importantly, a devia-
tion from a linear causation view of development would potentially impact the rate and
direction of evolution [8, 23, 24].
Although not always discussed, Darwin himself devoted much more attention to vari-
ation than to natural selection, presumably because he knew that a satisfactory theory of
evolutionary change requires the elucidation of the causes and properties of variation [25].
After all, natural selection would be meaningless without variation. Ironically, given the
success of the neo-Darwinian framework, phenotypic variation to a great extent has been
neglected in the study of evolution [26]. The mechanistic understanding of the sources of
phenotypic variation constitutes a fundamental gap in conventional evolutionary theory.
Neither Darwin, nor the founders of the neo-Darwinian modern synthesis were able to
address this problem given the biological knowledge available at the time. Moreover, de-
viations from the basic assumptions of the conventional theory were not always generally
appreciated [27].
Implicit Assumptions in Evolution
Being the development of science an evolutionary process itself, it is reasonable to expect
that social-historical contingency has profoundly biased the pathways of scientific inquiry.
This seems to be the case in the history of biology. For example, (1) Darwin’s war against
 on April 18, 2014http://biorxiv.org/Downloaded from 
4
divine explanations for biological complexity caused within the scientific community an
automatic rejection for any goal-oriented activity within organisms. This situation favored
the adoption of the the idea of random (uniform) variation [28, 29]. (2) The mainstream
focus of neo-Dawinism on optimizing reproductive success (fitness) by natural selection
of random variants; on the other hand, implicitly neglected the relevance of gene in-
teractions (see Figure 1a) [30]. Finally, (3) the establishment of the central dogma of
molecular biology (gene → mRNA → protein) further cemented a linear, unidirectional
scheme of causation of molecular traits (one gene - one protein, one trait) [10]. These
events are thought to be associated with a deeply rooted systematic bias towards linear
causation schemes in biology [10, 31]. They also favored the adoption of three major
implicit assumptions upon which the neo-Darwinian tradition was developed, namely:
(1) mutational events occur randomly (e.g. unstructured) along the genome; (2) given
that the phenotypic effects of successive mutations in evolution are of additive nature,
gene interactions and their phenotypic influence can be, to a large extent, ignored; and
(3) the phenotypic distribution of mutational effects mirrors the genetic distribution of
mutations [30].
Scientists are now re-examining the most basic assumptions about evolution in light
of post-genomic, systems biology [28, 32]. Compelling evidence has been presented even
against assumption (1) above. For example, Shapiro has shown how a truly random
(unstructured) nature of mutational events is empirically unsustainable. He has coined
the term “natural genetic engineering”, referring to the known operators that produce
genomic changes and which are subjected to cellular regulatory regimes of epigenetic
character [29]. It seems that the generative properties of genetic variation are nonuniform,
and thus, biased as well. Assumptions (2) and (3) above are, instead, mainly concerned
with how phenotypic variation is generated given a genetic background; or in other words,
with the mechanistic understanding of the GP map. Here, we are concerned with this
developmental process and its evolutionary relevance.
From Genes to Networks
At the beginning of the 21th century, biology confronted an uncomfortable fact: despite
the increasing availability of whole-genome sequence data, it was not possible to predict,
or even clarify, phenotypic observations. In fact, we now know that there is not sufficient
information in the linear DNA sequences of the complete genomes to recover and/or
understand the diverse phenotypic states of an organism. It was clear that cell behavior
was much more complex than anticipated. Since then, biological research has increasingly
been oriented towards a systems-level approach that goes beyond obtaining and describing
large data sets at the genomic, transcriptomic, proteomic or metabolomic levels. An
assumption of such systems approach to biology is that cell behavior can be understood in
terms of the dynamical properties of the involved molecular regulatory networks. Modern
molecular evolutionary studies are starting to incorporate this network thinking: genes are
not individual entities upon which evolutionary forces act independently. Evolutionary
forces, functional constraints, and molecular interactions are conditionally dependent on
the systems level [33]. How a systems-view impacts our understanding of the GP map?
 on April 18, 2014http://biorxiv.org/Downloaded from 
5
Genetic Variants
a) Genotypes Phenotype
...
...
...
Gene1
Selection
Reproduction
&
Mutation
GP map ?
b)
Genotype
Phenotypes
Nongenetic
Heterogeneity
Development Development
Generation 1
Development
Generation 2 Generation n
...
...
...
...
...
...
Figure 1. a) A straightforward genotype-phenotype relationship: the genetic distribution of
the observed locus would completely mirror the phenotypic distribution; gene interactions are
ignored; as a result, three different genotypes would correspond to the same phenotype given
the locus under observation. b) A developmental process from genotype to phenotype, a GP
map: through the development of an individual nongenetic phenotypic variation is generated
each generation; in an evolutionary time-scale, evolution operations (blue) produce genetic
variation. Selection acts on phenotypes; phenotypic variation is the product of both genetic
mutational operations and epigenetic developmental processes.
 on April 18, 2014http://biorxiv.org/Downloaded from 
6
Fundamental Sources of Natural Variation
Although the concepts of genotype and phenotype are fundamental to evolution, it is
not straightforward to operationally define them: In practice genotype and phenotype
distinctions are just partial [34]. This is partially the reason why simple theoretical
models are so important for the epistemology of evolution. A common working model
in systems biology is that in which the phenotypic state is defined at the cellular level.
The cellular phenotype is represented by the activity of each of its genes, its expression
pattern. Since the regulatory interactions among the genes within the cell constitute a
network, the network effectively represents the genotype of the cell, while its associated
expression profile represents its phenotype (Figure 2). The structure of the former derives
directly from the genome, while the latter changes through development. In practice, we
just observe certain expression patterns (e.g cell-types) - with small deviations - and not
others. Why is that?
GRN developmental dynamics generates phenotypic nongenetic (epigenetic)
heterogeneity
When thinking in terms of a genotype-phenotype distinction based on GRN dynamics,
it is natural to consider an abstract space where all the virtually possible phenotypes
reside. We call this space the state-space. Empirical observations suggest that something
should be maintaining cells within specific, restricted regions of this space. The structured
nature of the underlying GRN determines a trajectory in this state-space: given the state
of the genes regulating a gene i, and the functional form of the regulation, the gene i
is canalized to take specific future states. Eventually, this self-organizing process would
inevitably lead to the establishment of those states which are logically consistent with the
underlying regulatory logic. In this way, the GRN imposes constraints to the behavior of
the cell. The resultant states are denominated attractors and correspond to observable
cell-types. These are the basis of the well developed dynamical-systems theory of cell
biology (for a review, see [35, 36]). This theory was first applied to propose a GRN
grounded on experimental data for understanding how cell-fate specification occurs during
early flower development (see, [37, 38] and update in [39]). Originally, the approach was
inspired by theoretical work in randomly assembled networks by Stuart Kaufman [40]. In
the last decades, the theory has been supported by a wealth of consolidated theoretical
and experimental work (see, for example [7, 13, 41]).
Through GRN dynamics, development generates cellular phenotypes. The general
acceptance of this generative role necessarily implies deviations from the neo-Dawinian
framework. Importantly, (1) the effect of a perturbation (mutational or otherwise) on the
manifested phenotype is not uniformly distributed (truly random) across all the genes in
the network, and (2) the interactions in the network are fundamental to the establishment
of the phenotype. The orchestrating role of GRNs constitutes a non-linear GP map:
phenotypic variation does not scale proportionally to genotypic variation; it is not linear
(Figure 2). Two important consequences of these mechanistic view of developmental
dynamics have been eloquently pointed out recently. First, the nonlinear character of
this mapping ensures that the exact same genotype (network) is able to produce several
phenotypes (attractors) [40]. Second, given that molecular regulatory events are stochastic
in nature, a cell is able to explore the state-space by both attracting and dispersing forces -
 on April 18, 2014http://biorxiv.org/Downloaded from 
7
Genotypes Phenotypes
...
...
...
G1 G2 G3
P1 P2
P3
G = Gene
P = Phenotype
...G1 G2 G3
P1
G1 G2 G3
P3
...
G1 G2 G3
P2
...
Figure 2. The orchestrating role of GRNs constitutes a non-linear GP map. Through the
restrictions imposed by the interactions in GRNs, cellular phenotypes (represented by
expression profiles) are generated. Due to the nonlinear character of GRN dynamics, the GP
map is one-to-many. The effect of mutations in the phenotype is not uniformly distributed
over the genes, but depends on the interactions: mutations can or cannot result in different
phenotypes depending on the genetic background and the location of the affected genes in the
network.
forces that slightly deviate the dynamics from the determined trajectory. Any phenotype
of a cellular population at any given time is statistically distributed [10]. These sources
of variation are the natural product of developmental dynamics. Consequently, at any
given time, a population can manifest phenotypic variation that is relevant to evolution
(heritable) in the absence of genetic variation. How can we study evolution without
ignoring the fundamental role of developmental dynamics?
Evolutionary Systems Biology Approaches
A systems view to evolutionary biology, in which network models as GP mappings are
considered explicitly, is under development (see, for example [9, 11, 42]). Within this
general framework, several specific approaches are proposed in order to study the evo-
lutionary consequences of considering developmental sources of phenotypic variation. In
this section, we briefly present a preview of an emerging complementary approach.
Epigenetic(Attractors) Landscape Evolution
In 1950s, C.H. Waddington proposed the conceptual model of the epigenetic landscape
(EL), a visionary attempt to synthesize a framework that would enable an intuitive dis-
cussion about the relationship between genetics, development, and evolution [43]. His
reasoning was based on the consideration of a fact: the physical realization of the informa-
 on April 18, 2014http://biorxiv.org/Downloaded from 
8
tion coded in the genes - and their interactions - imposes developmental constraints while
forming an organism. Now, in the post-genomic era, a formal basis for this metaphorical
EL is being developed in the context of GRNs [10,44,45]. The key for this formalization
is an emergent ordered structure embedded in the state-space, the attractors landscape
(AL). As well as generating the cellular phenotypic sates (attractors), the GRN dynamics
also partitions the whole state-space in specific regions and restricts the trajectories from
one state to another one. Each region groups the cellular states that would eventually
end up in a single, specific attractor. These sub-spaces are denominated the attractor’s
basin of attraction. Given this (second) generative property of GRN dynamics, the for-
malization of the EL in this context is conceptually straightforward: the number, depth,
width, and relative position of these basins would correspond to the hills and valleys of
the metaphorical EL. We refer to this structured order of the basins in state-space as
the AL (see Figure 3). The characterization of an AL would correspond, in practical
terms, to the characterization of an EL. Is this formalized EL useful for the mechanistic
understanding of phenotype generation?
Multicellular morphogenetic processes unfold naturally in the EL
The structured EL is a generative property of the GRN dynamics, but at the same time, it
also constrains the behavior of a developing system. While a developing system is following
its dynamically constrained trajectory in state-space, developmental perturbations from
internal or external origin can deviate it. In a cellular population, then, the probability
of one phenotypic transition or another during development, as well as the stationary
distribution of phenotypes, would be conditioned on both the localization of the individual
cells in the EL and on the landscape’s structure. As a general result of this interplay,
determinism and stochasticity are reconciled, and robust morphogenetic patterns can be
established by a hierarchy of cellular phenotypic transitions (see, for example [44,45]). In
this way, morphogenetic processes effectively unfold on ELs. How could this theoretical
framework improve the understanding of evolutionary dynamics?
We have an effective nonlinear GP map from GRN to EL. Given an experimentally
characterized GRN, the EL associated to real, specific developmental processes can be
analyzed ( [13,44,45]). Both cellular phenotypes (attractors) and morphogenetic patterns
are linked to the structure of the EL. Can we describe this structure quantitatively? How
robust is the structure to genetic (network) mutation? Can we describe quantitatively
the change in structure in response to both mutational and developmental perturbations?
How slower is this rate of change in comparison to the time-scale of developmental dynam-
ics (landscape explorations)? What are the phenotypic consequences of different relative
rates of change? Does the resultant evolutionary trajectory of the reshaped EL struc-
ture subjected to mutations predicts the probability of phenotypic change (innovation) -
based, for example, in the appearance of new cellular phenotypes or morphogenetic pat-
terns? (Figure 3). Insight into these and similar questions could enhance the mechanistic
understanding of the evolution of morphogenetic processes.
 on April 18, 2014http://biorxiv.org/Downloaded from 
9
x(t+ δt) = F (x(t) u δt), ,
a)
GRN
state-space
AL
G1 G2 G3...
cell state
G1 G2 G3...
G1 G2 G3...
EL
b)
G1 G2 G3...
mutation
attractor
Figure 3. The Epigenetic (Attractors) Landscape. a) Through a dynamical mapping - a
mathematical representation of the gene regulatory logic - GRNs generate both the cellular
phenotypes (attractors) and the ordered structure of the state space - the AL. Through the
structure of the AL, the EL is formalized in the context of GRNs. b) The number, depth,
width, and relative position of attractors correspond to the hills and valleys of the EL. The
topography of the landscape can change in response to perturbations. Mutations could
eventually reshape the EL and consequently eliminate and/or generate novel phenotypes.
 on April 18, 2014http://biorxiv.org/Downloaded from 
10
Conclusion and Challenges
A modern systems view to biology enables tackling foundational questions in evolution-
ary biology from new angles and with unprecedented molecular empirical support. Little
is known about the mechanistic sources of phenotypic variation and its impact to evo-
lutionary dynamics. The explicit consideration of these processes in evolutionary mod-
els directly impacts our thinking about evolution. Simple, generic dynamical models of
GRNs, where genotypes and phenotypes can be unambiguously defined, are well-suited
to rigorously explore the problem. Further extension of these models in order to explore
and characterize the associated EL enables the study of the evolutionary consequences of
both genetic and non-genetic sources of phenotypic variation within the same coherent
theoretical framework. The network-EL approach to evolutionary dynamics is promising,
as it directly manifests the multipotency associated with a given genotype. Although con-
ceptually clear and well-founded, its practical implementation implies several difficulties,
nonetheless; specially in the case of high dimensional systems. Work has been done in
which the landscape associated with a specific, experimentally characterized GRN is de-
scribed quantitatively in terms of robustness and state transition rates [46], for example.
However, neither the methodology to derive ELs from GRNs, nor the quantitative descrip-
tion of ELs are standard procedures. Most approaches require approximations and are
technically challenging for the case of networks with more than 2 nodes. Further research
in the quantitative description of experimentally grounded GRNs is still needed in order
to explore the constraints and the plasticity of ELs associated with a genotypic (network)
space. In this regard, discrete dynamical models are promising tools for the exhaustive
characterization of the EL, and for the study of multicellular development [45]. A second
major challenge is the generalization of GRN dynamical models in order to include addi-
tional sources of constraint during development. Tissue-level patterning mechanisms such
as cell-cell interactions; chemical signaling; cellular growth, proliferation, and senescence;
inevitably impose physical limitations in terms of mechanical forces which in turn affect
cellular behavior. Although some progress has been presented in this direction [47, 48],
the problem certainly remains open.
The post-genomic era of biology is starting to show that old metaphors such as
Waddington’s EL are not just frameworks for the conceptual discussion of complex prob-
lems. The merging of conceptually clear theories, computational/mathematical tools, and
molecular/genomic data into coherent frameworks could be the basis for a much needed
transformation of biological research from mainly a descriptive exercise into a truly mech-
anistic, explanatory and predictive endeavor - EL models associated with GRNs being a
salient example.
References
1. Feinberg AP, Irizarry RA (2010) Stochastic epigenetic variation as a driving force
of development, evolutionary adaptation, and disease. Proceedings of the National
Academy of Sciences 107: 1757–1764.
 on April 18, 2014http://biorxiv.org/Downloaded from 
11
2. Frank SA, Rosner MR (2012) Nonheritable cellular variability accelerates the evo-
lutionary processes of cancer. PLoS biology 10: e1001296.
3. Freund J, Brandmaier AM, Lewejohann L, Kirste I, Kritzler M, et al. (2013) Emer-
gence of individuality in genetically identical mice. Science 340: 756–759.
4. Huang S (2009) Non-genetic heterogeneity of cells in development: more than just
noise. Development 136: 3853–3862.
5. Pisco AO, Brock A, Zhou J, Moor A, Mojtahedi M, et al. (2013) Non-darwinian
dynamics in therapy-induced cancer drug resistance. Nature communications 4.
6. Alvarez-Buylla ER, Azpeitia E, Barrio R, Beńıtez M, Padilla-Longoria P (2010)
From abc genes to regulatory networks, epigenetic landscapes and flower morpho-
genesis: making biological sense of theoretical approaches. Seminars in cell &
developmental biology 21: 108–117.
7. Jaeger J, Crombach A (2012) Lifes attractors. In: Evolutionary Systems Biology,
Springer. pp. 93–119.
8. Jaeger J, Irons D, Monk N (2012) The inheritance of process: a dynamical systems
approach. Journal of Experimental Zoology Part B: Molecular and Developmental
Evolution 318: 591–612.
9. Wagner A (2011) The origins of evolutionary innovations. Oxford University Press,
Oxford.
10. Huang S (2012) The molecular and mathematical basis of waddington’s epigenetic
landscape: A framework for post-darwinian biology? Bioessays 34: 149–157.
11. Soyer OS (2012) Evolutionary systems biology, volume 751. Springer.
12. Beńıtez M, Azpeitia E, Alvarez-Buylla ER (2013) Dynamic models of epidermal
patterning as an approach to plant eco-evo-devo. Current opinion in plant biology
16: 11–18.
13. Azpeitia E, Davila-Velderrain J, Villarreal C, Alvarez-Buylla ER (2014) Gene reg-
ulatory network models for floral organ determination. In: Flower Development,
Springer. pp. 441–469.
14. Ayala FJ (2007) Darwin’s greatest discovery: design without designer. Proceedings
of the National Academy of Sciences 104: 8567–8573.
15. Darwin C (1859) On the origins of species by means of natural selection. London:
Murray.
16. Depew DJ, Weber BH (1995) Darwinism evolving: Systems dynamics and the
genealogy of natural selection. Bradford Books/MIT Press.
17. Huxley J, et al. (1942) Evolution. the modern synthesis. Evolution The Modern
Synthesis .
 on April 18, 2014http://biorxiv.org/Downloaded from 
12
18. Nowak MA (2006) Evolutionary dynamics: exploring the equations of life. Harvard
University Press.
19. Schuster P (2008) Boltzmann and evolution: some basic questions of biology seen
with atomistic glasses. Boltzmanns Legacy : 217–241.
20. Lewontin RC (1974) The genetic basis of evolutionary change, volume 560.
Columbia University Press New York.
21. Alberch P (1991) From genes to phenotype: dynamical systems and evolvability.
Genetica 84: 5–11.
22. Schaper S, Louis A (2014) The arrival of the frequent: How bias in genotype-
phenotype maps can steer populations to local optima. PloS one 9: e86635.
23. Gould SJ, Lewontin RC (1979) The spandrels of san marco and the panglossian
paradigm: a critique of the adaptationist programme. Proceedings of the Royal
Society of London Series B Biological Sciences 205: 581–598.
24. Alvarez-Buylla E, Beńıtez M, Espinosa-Soto C, et al. (2007) Phenotypic evolution
is restrained by complex developmental processes. HFSP journal 1: 99–103.
25. Gould SJ (1983) Hen’s teeth and horse’s toes. WW Norton & Company.
26. Hallgŕımsson B, Hall BK (2011) Variation: a central concept in biology. Academic
Press.
27. Reid RG (2007) Biological emergences: Evolution by natural experiment. MIT
Press.
28. Shapiro JA (2011) Evolution: a view from the 21st century. Pearson Education.
29. Shapiro JA (2012) Rethinking the (im) possible in evolution. Progress in Biophysics
and Molecular Biology .
30. Wilkins AS (2008) Waddington’s unfinished critique of neo-darwinian genetics:
Then and now. Biological Theory 3: 224–232.
31. Huang S (2011) Systems biology of stem cells: three useful perspectives to help
overcome the paradigm of linear pathways. Philosophical Transactions of the Royal
Society B: Biological Sciences 366: 2247–2259.
32. Koonin EV (2011) The logic of chance: the nature and origin of biological evolution.
FT press.
33. Davila-Velderrain J, Servin-Marquez A, Alvarez-Buylla ER (2013) Molecular evo-
lution constraints in the floral organ specification gene regulatory network module
across 18 angiosperm genomes. Molecular biology and evolution : mst223.
34. Lewontin R (2011) The genotype/phenotype distinction. Stanford Encyclopedia of
Philosophy.
 on April 18, 2014http://biorxiv.org/Downloaded from 
13
35. Huang S, Kauffman S (2009) Complex gene regulatory networks-from structure to
biological observables: cell fate determination. Encyclopedia of Complexity and
Systems Science Meyers RA, editors Springer : 1180–1293.
36. Kaneko K (2011) Characterization of stem cells and cancer cells on the basis of
gene expression profile stability, plasticity, and robustness. Bioessays 33: 403–413.
37. Mendoza L, Alvarez-Buylla ER (1998) Dynamics of the genetic regulatory network
for¡ i¿ arabidopsis thaliana¡/i¿ flower morphogenesis. Journal of theoretical biology
193: 307–319.
38. Espinosa-Soto C, Padilla-Longoria P, Alvarez-Buylla ER (2004) A gene regulatory
network model for cell-fate determination during arabidopsis thaliana flower de-
velopment that is robust and recovers experimental gene expression profiles. The
Plant Cell Online 16: 2923–2939.
39. Sanchez-Corrales YE, Alvarez-Buylla ER, Mendoza L (2010) The arabidopsis
thaliana flower organ specification gene regulatory network determines a robust
differentiation process. Journal of theoretical biology 264: 971–983.
40. Kauffman SA (1993) The origins of order: Self-organization and selection in evolu-
tion. Oxford university press.
41. Huang S, Eichler G, Bar-Yam Y, Ingber DE (2005) Cell fates as high-dimensional
attractor states of a complex gene regulatory network. Physical review letters 94:
128701.
42. Cotterell J, Sharpe J (2013) Mechanistic explanations for restricted evolutionary
paths that emerge from gene regulatory networks. PloS one 8: e61178.
43. Waddington CH (1957) The strategy of genes. London: George Allen & Unwin,
Ltd.
44. Álvarez-Buylla ER, Chaos Á, Aldana M, Beńıtez M, Cortes-Poza Y, et al. (2008)
Floral morphogenesis: stochastic explorations of a gene network epigenetic land-
scape. Plos one 3: e3626.
45. Zhou JX, Qiu X, dHerouel AF, Huang S (2014) Discrete gene network models for
understanding multicellularity and cell reprogramming: From network structure
to attractor landscapes landscape. In: Computational Systems Biology Second
Edition Elsevier : 241–276.
46. Li C, Wang J (2013) Quantifying waddington landscapes and paths of non-adiabatic
cell fate decisions for differentiation, reprogramming and transdifferentiation. Jour-
nal of The Royal Society Interface 10: 20130787.
47. Barrio RÁ, Hernandez-Machado A, Varea C, Romero-Arias JR, Alvarez-Buylla E
(2010) Flower development as an interplay between dynamical physical fields and
genetic networks. PloS one 5: e13523.
 on April 18, 2014http://biorxiv.org/Downloaded from 
14
48. Barrio RA, Romero-Arias JR, Noguez MA, Azpeitia E, Ortiz-Gutiérrez E, et al.
(2013) Cell patterns emerge from coupled chemical and physical fields with cell
proliferation dynamics: the arabidopsis thaliana root as a study system. PLoS
computational biology 9: e1003026.
 on April 18, 2014http://biorxiv.org/Downloaded from 
Chapter 3
Metodoloǵıa
A Description of Phenomena is Not Equivalent to an Understanding
... an understanding of some phenomenon is not obtained by constructing and
adjusting a set of equations in such a manner that it provides an accurate model.
It is much more meaningful scientifically to seek the construction of simple models and
endeavor to derive an understanding from these than to attempt to mimic every detail
of each specific system we encounter
— Kunihiko Kaneko, Life: An Introduction to Complex Systems Biology (2006)
39
441
José Luis Riechmann and Frank Wellmer (eds.), Flower Development: Methods and Protocols, Methods in Molecular Biology,  
vol. 1110, DOI 10.1007/978-1-4614-9408-9_26, © Springer Science+Business Media New York 2014
Chapter 26
Gene Regulatory Network Models for Floral Organ 
Determination
Eugenio Azpeitia, José Davila-Velderrain, Carlos Villarreal,  
and Elena R. Alvarez-Buylla
Abstract
Understanding how genotypes map unto phenotypes implies an integrative understanding of the processes 
regulating cell differentiation and morphogenesis, which comprise development. Such a task requires the 
use of theoretical and computational approaches to integrate and follow the concerted action of multiple 
genetic and nongenetic components that hold highly nonlinear interactions. Gene regulatory network 
(GRN) models have been proposed to approach such task. GRN models have become very useful to 
understand how such types of interactions restrict the multi-gene expression patterns that characterize dif-
ferent cell-fates. More recently, such temporal single-cell models have been extended to recover the tem-
poral and spatial components of morphogenesis. Since the complete genomic GRN is still unknown and 
intractable for any organism, and some clear developmental modules have been identified, we focus here 
on the analysis of well-curated and experimentally grounded small GRN modules. One of the first experi-
mentally grounded GRN that was proposed and validated corresponds to the regulatory module involved 
in floral organ determination. In this chapter we use this GRN as an example of the methodologies involved 
in: (1) formalizing and integrating molecular genetic data into the logical functions (Boolean functions) 
that rule gene interactions and dynamics in a Boolean GRN; (2) the algorithms and computational 
approaches used to recover the steady-states that correspond to each cell type, as well as the set of initial 
GRN configurations that lead to each one of such states (i.e., basins of attraction); (3) the approaches used 
to validate a GRN model using wild type and mutant or overexpression data, or to test the robustness of 
the GRN being proposed; (4) some of the methods that have been used to incorporate random fluctua-
tions in the GRN Boolean functions and enable stochastic GRN models to address the temporal sequence 
with which gene configurations and cell fates are attained; (5) the methodologies used to approximate 
discrete Boolean GRN to continuous systems and their use in further dynamic analyses. The methodolo-
gies explained for the GRN of floral organ determination developed here in detail can be applied to any 
other functional developmental module.
Key words Gene regulatory networks, Functional module, Flower development, Cell differentiation, 
Attractors, Morphogenesis, Dynamics, Floral organ determination, Attractors, Basins of attraction, 
Stochastic networks, Mathematical models, Computational simulations, Robustness
Eugenio Azpeitia, José Davila-Velderrain, and Carlos Villarreal contributed equally to this work.
442
1  Introduction
The mapping of the genotype unto the phenotypes implies the 
concerted action of multiple components during cell differentiation 
and morphogenesis that comprise development [1]. These compo-
nents are part of regulatory motifs, which hold nonlinear interac-
tions that produce complex behaviors [2, 3]. Such complexity 
cannot be understood in terms of individual components, and 
rather emerges as a result of the interactions among the compo-
nents of the whole system. In order to integrate the action of 
multiple molecular components and follow their dynamics, it is 
indispensable to postulate mathematical and computational models. 
Gene regulatory network (GRN) models have appeared as one of 
the most powerful tools for the study of complex molecular sys-
tems. Small GRNs can sometimes be studied with analytical mathe-
matical formulations, while medium or large size GRNs are amenable 
for dynamical analyses only with computer simulations [4]. As fol-
lowing the dynamics of the genomic interactomes is still intractable 
even with the most powerful computers, and given the fact that 
genomic networks are composed of multiple structural and func-
tional modules, others and we have proposed to search for such 
modules for the study of biomolecular systems dynamics using 
GRN models (e.g., [5–7]).
Boolean models are probably the simplest type of formalism 
employed for the study of GRNs. Nonetheless, Boolean models 
provide meaningful information about the system. Importantly, 
Boolean GRNs can be approximated to continuous models that 
enable the use of additional mathematical tools [4, 8]. Given that: 
(a) the logic of GRNs is adequately formalized with Boolean mod-
els; (b) obtaining real biological parameters from biological molec-
ular systems is still a complicated task; and (c) the use of realistic 
models can be computationally expensive, we believe that Boolean 
models and their continuous approximations are becoming a fun-
damental and practical tool to study GRN dynamics and to under-
stand the complex behaviors observed in developmental processes 
(see refs. 9–11).
Based on the above rationale, the first step in building a GRN 
model is the identification of a developmental module and the 
integration of all the experimental data on the molecular compo-
nents participating in it. The ABC genetic model of floral organ 
determination (see refs. 3, 12) (see Chapter 1) is part of a clearly 
circumscribed developmental module that underlies the sub- 
differentiation of the floral meristem in four concentric rings early 
on during flower development. From the outermost part of the 
floral meristem to its center, each ring comprises the primordial 
cells of sepals, petals, stamens, and carpels. Based on experimental 
Eugenio Azpeitia et al.
443
evidence [13], it became obvious that although necessary, the ABC 
genes are not sufficient to specify floral organs. The ABC model 
has been instrumental to understanding flower development and 
evolution. However, it does not constitute a dynamic model able 
to recover the ABC combinatory code, as well as explain how the 
expression profiles of the set of molecular components included in 
the flower organ determination GRN, which includes the ABC 
genes, is established to promote the sepal, petal, stamen and carpel 
cell fates. Importantly, such a dynamic GRN model is the basis to 
understand how such cell types are determined in time and space, 
and thus, how the morphogenetic pattern that characterizes young 
floral meristems will form adult flowers [12, 14].
In order to uncover the necessary and sufficient set of interact-
ing components involved in floral organ specification, the first step 
implies recovering the experimental evidence of ABC gene interact-
ing components that include both regulated and regulator genes. 
In the case of Boolean models, the experimental data is formalized 
in the form of Boolean functions, which determine the dynamics of 
the GRN. In Boolean or any other type of discrete network, it is 
possible to fully explore the whole set of configurations or states of 
the system, and find the steady state configurations (attractors; see 
below). Kauffman postulated that the attractors to which GRNs 
converge, could correspond to the states characterizing differenti-
ated cells [15]. More recently, Boolean GRNs have been grounded 
on experimental data ([5]; see review in ref. 3) showing that the 
attractors of developmental networks indeed correspond to the 
stable gene configuration observed in different types of cells, 
as long as a sufficient set of components involved in a given devel-
opmental module are incorporated.
In this Chapter we focus on the regulatory module underlying 
floral organ determination in Arabidopsis thaliana during early 
stages of flower development. Some of the methodologies explained 
here have been used in previous publications on such GRN [5, 7, 
16–19]. In this chapter we will use examples extracted mainly from 
our own studies to explain how to develop and extend experimen-
tally supported Boolean GRN models. Then, we explain how to 
incorporate stochastic properties in the model, which can allow us 
to explore the temporal sequence with which attractors or cell gene 
configurations and cell-fates are attained (e.g., [4]). Finally, we 
explain how we can approximate the Boolean model to a continu-
ous one that can then be used in other types of models, for example, 
to explore spatial aspects of morphogenesis [14]. It is important to 
keep in mind that the tools presented in this Chapter can be applied 
to any GRN. Consequently, we begin with general explanations and 
afterwards we use examples from the literature to illustrate each 
methodological step.
Floral Gene Regulatory Network Models
444
2  Methods
GRN nodes and edges: In GRNs, nodes represent genes, proteins 
or other types of molecular components such as miRNAs and 
hormones, while edges represent regulatory interactions among 
the components. Usually the interactions are positive (activations) 
or negative (inhibitions), but other type of interactions can be 
included (e.g., protein-protein interactions).
Variables: Variables are the elements that describe the system under 
study (usually the nodes) and which can take different values at 
each time.
Variable/Gene state: The value that a node takes at a certain time 
represents its state. The state can be a discrete or continuous value. 
In the case of Boolean networks the states can only be “0” when 
“OFF” and “1” when “ON.”
Network State/Configuration: The vector composed by a set of 
values, where each value corresponds to the state of a specific gene 
of the network. In a Boolean network such vectors or network 
configurations are arrays of “0’s” and “1’s.”
Attractors: Stationary network configurations are known as attrac-
tors. Single-state, stationary configurations are known as fixed-
point attractors (Fig. 1a) and these are generally the ones that 
correspond to the arrays of gene activation states that  characterize 
2.1  Definitions
Fig. 1 Fixed-point attractors, cyclic attractors, and transitory states. (a) An example of a fixed-point attractor. 
As observed, fixed-point attractors have one unique state where they stay indefinitely unless something per-
turbs them. (b) An example of a cyclic attractor. Cyclic attractors are composed of two or more network states 
that orderly repeat. In this case we observe a two state cyclic attractor. (c) Transitory states. Transitory states 
are states that lead to an attractor, but are not attractors themselves
Eugenio Azpeitia et al.
445
different cell types. Whereas a set of network states that orderly 
repeat cyclically correspond to cyclic attractors (Fig. 1b).
Transitory states: All states that are not or do not form part of an 
attractor are transient or transitory states (Fig. 1c).
Basin of attraction: The set of all the initial configurations that 
eventually lead to a particular attractor constitute its basin of 
attraction.
Expected or observed attractors: Gene expression profiles or config-
urations that have been obtained from experimental assays and 
reported in the scientific literature for particular cell types are 
referred to here as the expected or observed attractors. Such attrac-
tors are expected to be recovered by the postulated GRN (Fig. 2).
Model Validation: The task of evaluating a model by means of con-
trasting its predictions with experimental results. For Boolean 
GRNs, model validation would imply, among others: recovering 
the observed gene configurations for the cells under study under 
wt and mutant or overexpression conditions, robustness analyses, 
etc. (see below).
Robustness: The ability of a system to maintain an output in the face 
of perturbations. For the case of a Boolean GRN model, it is evalu-
ated, for example, by assessing if the system’s attractors are still 
recovered under different transient and permanent mutations 
(alterations in the Boolean functions, nodes, or GRN topology).
Fig. 2 The set of expected attractors. As explained in the main text, the set of 
expected attractors is obtained from the experimental information. In the case 
of cell types, the attractors correspond to the observed stable gene configuration of 
each cell type. Thus, if our system consists in three different cell types, one cell type 
with GEN1 expression, other with GEN2 expression, and a third one with both GEN1 
and GEN2 expression, our set of expected attractors will be exactly this
Floral Gene Regulatory Network Models
Cell type 1 Ce l! type 2 
Cell type 1 1 O 
Cell type 2 O 1 
Cell type 3 
• Gtnl tllpr~u¡o n 
Cell type 3 1 1 
Gtnl ~Ilpr~u¡o n 
446
A generic protocol to postulate a GRN model for a particular 
developmental module would be as follows:
 (i) Identify a structural or functional developmental module 
(see Note 1).
 (ii) Based on available experimental data, select the set of poten-
tial nodes or molecular components that will be incorporated 
in the GRN model with the aim of integrating the key neces-
sary and sufficient components of the functional module 
under analysis. Then, explore the experimental data concern-
ing the spatio-temporal expression patterns of the genes to 
be incorporated in the model and assemble a table with a 
Boolean format of the expected configurations that should 
be recovered with the GRN model (such configurations are 
the “expected attractors”) (see Note 2).
 (iii) Integrate and formalize the experimental data concerning 
the interactions among the selected nodes using Boolean 
logical functions that will rule the Boolean GRN dynamics.
 (iv) The GRN is modeled as a dynamic system by exploring the 
states attained, given all possible initial configurations and 
the Boolean functions defined in (iii). The GRN is initialized 
in all possible configurations and followed until it reaches a 
fixed-point or cyclic attractor (see Note 3).
 (v) Compare the simulated attractors to the ones observed 
experimentally (expected attractors; see item (ii) above). 
A perfect coincidence would suggest that a sufficient set of 
molecular components (nodes) and a fairly correct set of 
interactions have been considered in the postulated GRN 
model. If this is not the case, additional components and 
interactions can be incorporated or postulated, or the 
Boolean functions can be modified. This allows to refine 
interpretations of experimental data, or to postulate novel 
interactions to be tested experimentally in the future. In any 
case, the process can be repeated several times based on the 
dynamical behavior of the modified versions of the GRN 
under study until a regulatory module is postulated. Such 
module can include some novel hypothetical interactions or 
components, integrate available experimental data, and iden-
tify possible experimental contradictions or holes.
 (vi) To validate the model, it is addressed if it recovers the wt and 
mutant (loss of function and gain of function) gene activa-
tion configurations that characterize the cells being consid-
ered. Perturbation analyses of the nodes and interactions, or 
the Boolean functions, can also be used for validating the 
model in order to test the robustness of the GRN under 
study. Eventually, novel predictions can be made and tested 
experimentally.
2.2  General Protocol
Eugenio Azpeitia et al.
447
 (vii) To recover the dynamics of the GRN and the temporal pat-
tern of attractor attainment, the logical functions can be 
modeled as stochastic ones. Observed temporal patterns of 
cell-fate or gene configurations attainment can be used to 
validate the GRN model under consideration.
 (viii) For further applications and also in cases that continuous 
functions are appropriate to describe the behavior of some of 
the components, the Boolean model can be approximated to 
a continuous one (see Subheading 2.5). Besides being useful 
for further modeling procedures, the continuous approxima-
tion is also a means of performing a robustness analysis of the 
GRN under study. Such a task hence implies as well a further 
validation of the model being postulated.
 (ix) Equivalent approaches to the ones summarized in (vi) and 
(vii) for discrete systems can be used in continuous ones.
There are two types of materials needed when modeling 
dynamic GRNs. First, the expected results to be recovered by the 
model that are extracted from the literature and depend on the 
aims of the model and the nature of the developmental module 
being considered, but generally include stable gene configurations 
(attractors), mutant phenotypes, and developmental transitions, to 
name a few. The second set is the software required for the analyses 
of the GRN. Currently there are several available programs for 
GRN analyses (see Note 8). In the following sections, we explain 
with more detail and specific examples how this general protocol 
can be applied. We start by explaining the simplest Boolean 
approach for dynamical GRN modeling.
In Boolean GRN models, nodes can only attain one of two possi-
ble values: “1” if the node is “ON,” and “0” if the node is “OFF.” 
A “0” node value usually represents that a gene is not being 
expressed, but can also represent the absence of a protein or hor-
mone, while a “1” node value represents that a gene is expressed 
or another type of molecular component is present. As mentioned 
above, the first step in building a network is to extract the neces-
sary experimental information to define the set of components to 
be considered in the GRN model, the set of expected attractors, 
and the Boolean functions that formally integrate the experimental 
data and define the dynamics of the GRN.
In Boolean GRNs, the network states (see Subheading 2.1) are 
defined by vectors of 0s and 1s. While a formal mathematical defi-
nition of attractors can be found on the chapter “Implicit Methods 
for Qualitative Modeling of Gene Regulatory Networks” of 
another Springer Protocols book [20], in Subheading 2.1 we give 
a more pragmatic definition of attractors, and we prefer to stick to 
it. In 1969, Kauffman proposed that the attractors of a GRN model 
2.3 Deterministic 
Boolean GRN Model
2.3.1  Expected Attractors
Floral Gene Regulatory Network Models
448
could correspond to stable gene configurations characteristic of 
particular cell types or physiological states (see Subheading 2.1; 
Fig. 2). Consequently, the expected attractors are defined from 
gene expression patterns obtained from the literature, as well as 
from other data sources that clearly define the spatio-temporal 
gene configuration of the system. For example, Espinosa-Soto and 
collaborators [7] defined the expected attractors from the gene 
expression patterns reported in scientific publications. In another 
study, La Rota and collaborators [19] integrated experimental data 
into a gene expression map for the sepal primordium. Based on its 
expression map they defined zones with different combinations of 
gene expression, and each zone corresponded to an expected 
attractor. Defining the expected set of attractors is an indispensable 
step when building the GRN model, because they are used to vali-
date the GRN (see below). Although it should be clear that the 
postulation of the Boolean functions is an independent task, and 
hence, it does not imply circularity.
In a Boolean GRN model the state of expression of each gene 
changes along time according to the dynamic equation
 x t + f x t x t x ti i kτ …( ) = ( ) ( ) ( )( )1 2, , , ,  (1)
in which the future state of gene i evolves temporally as a function 
of the current state of its k regulators. Boolean functions fi can be 
formalized as logical statements or as truth tables. Logical state-
ments use the logical operators “AND,” “OR” and “NOT” to 
describe gene interactions, while in truth tables the state of the 
gene of interest is given for all possible state combinations of its k 
regulators (see Note 4). Logical operators can be combined in 
order to describe complex gene regulatory interactions, and can 
always be translated into an equivalent truth table. In Fig. 3, we 
provide examples of common gene regulatory interactions formal-
ized as logical statements with their equivalent truth table. 
Consequently, in general, Boolean functions are generated from 
experimental evidence (but see Note 5). For example, if TGEN 
(a target gene) is ectopically expressed in a GEN1 loss-of-function 
background, it is inferred that GEN1 is a negative regulator of 
TGEN, and we use the “NOT” logical operator to describe GEN1 
regulation over TGEN or its equivalent truth table (Fig. 4). In this 
Boolean function, the state of TGEN at time t + τ is 1 if GEN1 
value is 0 at time t, and TGEN value at time t + τ is 0 if GEN1 value 
is 1 at time t (see Note 6).
The Boolean functions of the GRN developmental module 
being used here as an example, were grounded on available experi-
mental information [5, 7, 17–19]. As with expected attractors, 
Boolean functions can be grounded on different types of 
 experimental data, as long as they clearly state how genes interact 
(see Note 7). We now will provide an example of how the 
2.3.2  Boolean Functions
Eugenio Azpeitia et al.
449
experimental information was integrated and formalized as a 
Boolean function. During the transition from inflorescence to 
flower meristem, the expression of TERMINAL FLOWER 1 
(TFL1) needs to be repressed [21, 22], because TFL1 is a pro-
moter of inflorescence development [23]. TFL1 is transcribed in 
the center of the meristem and from there it moves to peripheral 
cells [24]. EMF1 is assumed to be a positive regulator of TFL1 
because the emf1 mutant is epistatic to tfl1 loss-of-function mutant, 
and both, tfl1 and emf1 mutants have similar phenotypes in terms 
of inflorescence meristem identity [25]. The over expression phe-
notype of AP1 is similar to the loss-of-function of TFL1, and in the 
ap1 mutant TFL1 is ectopically expressed, suggesting that AP1 is a 
negative regulator of TFL1 [26]. Similarly, TFL1 expression is not 
observed in LFY over expression and is ectopically expressed in 
LFY loss-of-function mutants [27]. According to these results, 
EMF1 is a positive regulator of TFL1, while AP1 and LFY are 
Fig. 3 Examples of common Boolean functions. Here we present four examples 
of common Boolean functions for a target gene, in this case TGEN, with two regu-
lators, namely, GEN1 and GEN2
Floral Gene Regulatory Network Models
o 
o 
1 
1 
o 
O 
1 
1 
o 
O 
1 
1 
o 
O 
1 
1 
o 
1 
O 
1 
o 
1 
O 
1 
o 
1 
O 
1 
o 
1 
O 
1 
o 
O 
O 
1 
o 
1 
1 
1 
o 
O 
1 
O 
o 
1 
O 
O 
TGEN(t+T) = GEN1(t) & GEN2(t) 
TGEN(t+T) = GEN1(t) I GEN2(t) 
TGEN(t+T) = GEN1(t) & ! GEN2(t) 
TGEN(t+T) = ! GEN1(t) & GEN2(t) 
450
negative regulators of TFL1. These results were formalized as a 
logical statement [18] as follows:
 TFL EMF AND NOT AP AND NOT LFY1 1 1=  
A complete list of the Boolean functions and the experimental 
evidence for this model can be found in refs. 7, 18; note some typo-
graphical errors corrected in refs. 1, 12.
Once the Boolean functions and the set of expected attractors of 
the GRN are obtained, we can proceed to make a first, necessary 
validation of the GRN. The first step is to use numerical simula-
tions to recover the attractors that our set of Boolean functions 
generates (see Note 8). The attractors recovered in the simulations 
must coincide with the expected attractors, based on experimental 
data. In Espinosa-Soto and collaborators [7] ten attractors were 
recovered. Four out of the ten attractors corresponded to gene 
activation configurations that characterize meristematic cells of 
inflorescence meristems, while the rest corresponded to the gene 
configurations observed in sepal, petal, stamen and carpel primor-
dial cells (Fig. 5). In the GRN for sepal development formulated 
by La Rota and collaborators [19], at least two attractors were 
recovered; one corresponding to the abaxial and the other one to 
the adaxial cells of the floral organ.
2.3.3 Validating the GRN: 
Simulated Attractors vs. 
Expected Attractors
Fig. 4 Truth table and logical statement of the example explained in the main text. 
(a) TGEN expression is not observed in the GEN1 loss-of-function background. 
Hence, we can assume that GEN1 is a negative regulator of TGEN. This Boolean 
function can be represented with a (b) truth tab. or a (c) logical statement
Eugenio Azpeitia et al.
a 
TGEN expresslon In 
WT 
b 
GEN1(t) 
o 
1 
TG[Nj H ) = ! G[N 1(t) 
TGEN cxprcssion in 
GENlloss-of· 
func-tion background 
TGEN(t+t ) 
1 
O 
451
Fig. 5 Obtained attractors of the flower organ specification GRN. In (a) we present the graph of the flower organ 
specification GRN proposed by Espinosa-Soto and collaborators [7]. The GRN recovered 10 fixed-point attractors. 
Six of the attractors corresponded to the observed gene configuration in the primordial cell of sepals (one attrac-
tor), petals (two attractors), statements (two attractors), and carpels (one attractor). (b) A flower meristem in which 
the primordial sepal cells are colored in green, primordial petal cells in brown grey, primordial stamens in orange, 
and primordial carpel cells in yellow. In (c), the ABC model and the floral organ determination GRN attractors that 
correspond to A, A + B, B + C, and C gene combinations, which specify sepal, petal, stamen, and carpel primordial 
cells, respectively. The activation states correspond to each of the GRN nodes starting on the left with “EMF1” and 
consecutively progressing clockwise the rest of the genes in the GRN shown in (a)
Floral Gene Regulatory Network Models
e A+B B+C 
A .--..... _.1_1_11 C 
I 01lII0l.-u OllOO1lO1111Ol1 01 1011101 11 1110 011011 1011001 10 I 
452
In cases in which the attractors recovered by the simulated 
GRN under study and those observed experimentally do not coin-
cide, additional nodes or interactions can be considered, or the 
postulated Boolean functions can be modified (Fig. 6). Such novel 
hypotheses can be tested by running the GRN dynamics once 
more, and if the simulated and observed (expected) attractors now 
coincide, the model can be used to postulate novel interactions, 
missing data, or contradictions among those that had been pro-
posed previously. For example, in Espinosa-Soto and collaborators 
[7] four missing interactions were predicted. Importantly, some of 
these predictions have been experimentally validated by indepen-
dent and posterior research, demonstrating the predictive capacity 
and usefulness of this approach.
An additional means to validate a GRN model is to simulate loss-
of- function (fixing the mutated gene expression value to 0) and 
gain-of-function (fixing the overexpressed gene expression value to 
1) mutants. The recovered attractors in the model with such altered 
fixed expression values must correspond to the effects experimen-
tally observed in the corresponding mutants (see Fig. 7; Note 9). 
If a discrepancy is found in such a validation process, additional 
hypotheses concerning new nodes or interactions can be postu-
lated. For the postulated GRN module underlying floral organ 
determination, most of the recovered attractors in the simulated 
mutants corresponded to the genetic configurations that have been 
observed experimentally [7, 17, 18]. In some cases, the simulated 
and observed (expected) attractors did not coincide and new 
 interactions were postulated. For example, in Espinosa-Soto and 
collaborators [7] a positive feedback loop was predicted for the 
2.3.4  Mutant Analysis
Fig. 6 The set of expected attractors vs. the set of obtained attractors. Both the set of expected and obtained 
attractors must coincide, when this do not happens it is usually assumed that there is some wrong or missing 
information
Eugenio Azpeitia et al.
Expected GENl 
attractors 
(ell type 1 1 o 
(ell type 2 1 1 
(ell type 2 o 
Obtained GENl 
attractors 
(ell type 1 
(ell type 2 
1 
1 
1!I'e-~ !! 
o 
o 
1 
453
gene AGAMOUS (AG), even though this seemed unlikely because 
in the ag-1 loss-of-function mutant plants, the AG expression 
pattern is the same as in wild-type plants [28]. In a posterior study 
in an independent laboratory, the prediction was verified experi-
mentally [29].
Simulations of mutants are also useful when trying to predict 
the effects of multiple mutants, which are complicated to generate 
in the laboratory. Moreover, even when the GRN involved in 
flower determination in Arabidopsis and Petunia seems to be 
 conserved, the mutant phenotypes are not identical. Espinosa-Soto 
and collaborators [7] used mutant analyses to test the effect of a 
Fig. 7 Loss-of-function and gain of function mutant simulations. Loss-of- function 
and gain-of-function mutant simulations are done by fixing the state of the 
desired gene to 0 and 1, respectively. In (a) the Boolean function of a non- 
mutated GEN1. In (b) and (c) the Boolean function of the same gene in a loss-of- 
function and a gain-of-function simulation, respectively. The Boolean functions 
are presented as truth tables and as logical statements. lof = loss-of-function, 
gof = gain-of-function
Floral Gene Regulatory Network Models
o 
O 
1 
1 
o 
O 
1 
WT GENl simulation 
o 
1 
O 
101 G E N1\ i mu l . ~ o n 
O 
1 
O 
o 
O 
O 
1 
O 
O 
O 
1 gof GENj simulation O 
O 
O 
1 
1 
O 
1 
O 
1 
1 
1 
1 
1 
TGEN expression 
IfGENl = 1 TGEN = 1 
If GENl = OTGEN = O 
TGEN = O 
TGEN = 1 
454
duplication in B genes that has been reported in Petunia, and 
recovered the single mutant that had been described, and at the 
same time predicted the expected phenotype for the double mutant 
of the two duplicates.
Experimental and theoretical work has demonstrated that living 
organisms are robust against perturbations. Moreover, at the 
molecular level the processes involved in different biological behav-
iors are also robust against internal and external variations. Such 
robustness implies that the overall functionality of the system 
remains when perturbed [30, 31]. In the case of GRNs, attractors 
should be robust when the Boolean functions are altered. In 
Espinosa-Soto and collaborators [7] the output value of every line 
of the truth tables was changed one by one. Interestingly, we found 
that the original attractors did not change for more than 95 % of 
the logical table alterations, indicating that the functionality of the 
postulated developmental module is robust to this type of pertur-
bation. There are other types of perturbation analyses. For example, 
we could change with a certain probability the value of a line of the 
truth table, or the state of the network. Similarly, if we perturb the 
GRN with these other types of perturbations, the systems’ attractors 
are expected to be maintained.
In deterministic GRN models, as the Boolean model exposed 
above, the system under study always converges to a single attrac-
tor if initialized from the same configuration, and once it attains 
such steady-state, it remains there indefinitely. However, during a 
developmental process, cells change from one stable cell configura-
tion to another one in particular temporal and spatial or morpho-
genetic patterns. In order to explore questions such as how 
differentiating cells decide between one of the available attractors, 
or the order in which the system converges to the different attrac-
tors, given an initial condition, and to make statistical predictions 
of such possible behaviors, a stochastic formalism is needed.
In this section we develop a discrete stochastic model as an 
extension of the deterministic Boolean GRN. We then show how 
this approach can be used to explore the patterns of cell-fate attain-
ment. Specifically, the model formalism explained here allows the 
investigation of the temporal sequence with which attractors are 
visited in the GRN when noise or random perturbations drive the 
system from one attractor to any other one.
In a Boolean GRN model the dynamics given by Eq. 1 is determin-
istic: for a given set of Boolean functions fi (see Subheading 2.3.2), 
the configuration of the network at time t completely determines 
the configuration of the network at the next time step t + 1 (con-
ventionally τ = 1). If Eq. 1 is iterated starting from a given initial 
configuration (defined by an array of n entries with 0s and 1s 
2.3.5 Robustness 
Analyses
2.4 Stochastic 
Boolean GRN Model: 
Temporal Sequence of 
Cell-Fate Attainment
2.4.1 From Deterministic 
to Stochastic Models
Eugenio Azpeitia et al.
455
representing the activation states of the n genes), the network will 
eventually converge to an attractor. This deterministic version 
implies that once the system reaches an attractor, it remains there 
for all subsequent iterations. However, if noise is introduced into 
either the Boolean functions, or the gene states, there is a finite 
probability for the system to “jump” from one basin of attraction 
to another one (for definitions, see Subheading 2.1) and conse-
quently, from one attractor to another one. Such a stochastic 
Boolean model of the GRN enables the study of transitions among 
attractors.
Noise can be implemented in a Boolean GRN model in several 
ways (see Note 10). Here we implement noise by introducing a 
constant probability of error ξ for the deterministic Boolean func-
tions. In other words, at each time step, each gene “disobeys” its 
Boolean function with probability ξ, such that in the stochastic 
version, Eq. 1 is extended to
 x t
f t with prob
f t with probi
i
i
+( ) =
( ) −
− ( )




τ
ξ
ξ
, .
.
1
1
 (2)
Note that the stochastic version (e.g., Eq. 2) reduces to a 
deterministic one (Eq. 1)) when ξ = 0. In the model, the stochastic 
perturbations are applied independently and individually to each 
gene at each iteration. This implementation of noise for stochastic 
Boolean modeling of GRNs has been referred to as the stochastic-
ity in nodes (SIN) model with the assumption of a single fault at a 
time [20, 32].
When Eq. 2 is iterated, both the set of Boolean functions fi and the 
error probability ξ determine the configuration of the network at 
the next time step. Under this stochastic dynamics, a given initial 
configuration will no longer converge to the same attractor each 
time. This situation allows us to estimate a probability of transition 
from one network state to another state as the frequency with 
which this transition occurs in a large number of repetitions of the 
same iteration (see below). The estimated transition probabilities 
can then be used to study the behavior of the system and to make 
statistical predictions.
As we want the model to be useful in the exploration of the 
patterns of temporal cell-fate attainment, the network states that 
we are interested in are the fixed-point attractor states that repre-
sent the cell types. Thus, we need to estimate the probability pij of 
transition from the attractor i to the attractor j. From the deter-
ministic Boolean model, we already know to which attractors the 
network converges. In the following we use the term attractor to 
refer to both, the attractor and its basin. Thus, we can define a 
scalar (single-valued) variable Xt to describe the state of the net-
work in terms of the specific attractor in which the network is in at 
2.4.2 The Transition 
Probability Matrix
Floral Gene Regulatory Network Models
456
time t. Then, Xt will take at time t any value from the ordered set 
(1,2,i, …,K) where each i represents one specific attractor from the 
available k attractors. The configuration of the network at time t is 
then related to the configuration at time t + 1 through what is 
known as the transition probabilities. If the network is in attractor 
i at time t, at the next time step t + 1, it will either stay in attractor 
i or move to another attractor j.
Formally, pij denotes a one-step transition probability that is 
defined as the following conditional probability:
 p Prob X j | X iij t+ t= = ={ }1 ,  (3)
the probability that the network at time t + 1 is in the attractor j 
given that it was in the attractor i at the previous time t, where 
i, j = 1, 2, …, K for K attractors. The set of probabilities pij can be 
expressed in matrix form:
 
P
p p
p p
k
k kk
=
æ
è
ç
ç
ç
ö
ø
÷
÷
÷
11 1
1
.
⋯
⋮
⋯
⋱ ⋮
 
As the number of attractors K is finite, P is a K × K transition 
matrix. Operationally, under the current model, one can estimate 
the probabilities of the i-th row by first iterating Eq. 2 one time 
step starting from a given initial configuration corresponding to 
the basin of attraction of attractor i. If, after the iteration, the sys-
tem remains in the same attractor, or the same basin of attraction, 
one count is added to the diagonal entry that corresponds to Pii. If 
the configuration ends up in a different basin j, the count is added 
to the column j that corresponds to pij. This process is repeated a 
large number of times (e.g., 10,000) for each of the possible Ω = 2n 
initial conditions. For each state (attractor), the one-step transition 
probabilities should satisfy 
j i
K
ijp
=
=S 1  and pij ≥ 0. This means that in 
the transition matrix P, the rows must sum to 1. This is achieved 
by dividing the number of counts in each matrix entry by the total 
number of configurations that started in the corresponding matrix 
row (e.g., basin i). As the dynamics in Eq. 2 are driven by both the 
Boolean functions fi and the error probability ξ, given a fixed set of 
Boolean functions, different values of ξ will result in different values 
of the transition probabilities pij (see Note 11).
Once the transition matrix P is calculated, it can be used in a dynamic 
model to describe how the probability of being in a particular attrac-
tor changes in time. In other words, we are now in position to derive 
a probabilistic dynamic model to simulate the dynamics of temporal 
cell-fate attainment.
In the previous subsection, the dynamics of transition between 
attractor states were defined in terms of transition probabilities. 
2.4.3 The Probabilistic 
Dynamics of Cell-Fate 
Attainment
Eugenio Azpeitia et al.
457
When this is the case, the state of the network at any given time Xt 
can only be represented by its associated discrete probability 
distribution. We denote this distribution by the vector pX(t) = (p1(t), 
p2(t), …, pK(t)), where pi(t) represents the probability of the 
network being in attractor i at time t, and 
i i
K
ip t
=
( ) =S 1 .
Given pX(t), the probability distribution associated with Xt + 1 
can be found by multiplying the transition matrix P by pX(t). We 
obtain the following dynamic equation
 p t 1 p t P
X X
+( ) = ( ) ,  (4)
this latter equation projects the process forward in time, and it 
allows us to follow the dynamics of the probabilities of cell-fate 
attainment by means of straightforward iteration.
In order to do so, it is necessary to specify an initial vector 
p
X
(t = 0) which represents the probability distribution of the net-
work state at time t = 0. In biological terms, this initial vector can 
be interpreted as the representation of how a large population of 
cells is distributed over the available attractors. In other words, 
how many cells of each type are in the population at the initial time 
t = 0. As the probabilities pi sum to one, an underlying assumption 
is that the number of cells in the population remains constant. 
In the next subsection we show how this initial distribution can be 
chosen based on a biological motivation in order to explore a spe-
cific question regarding the dynamics of cell-fate attainment dur-
ing floral organ formation. When the matrix P and the initial vector 
pX(0) are specified, Eq. 4 can be iterated (see Note 12); this process 
will generate a trajectory for the temporal evolution of the proba-
bility of each of the attractors. Every attractor will have a maximum in 
the probability of being reached at particular times. This maximum 
corresponds to the moment at which the corresponding cell-fate is 
most likely. Thus, the order in which the maximal probability of 
the different attractors is reached may serve as an intrinsic explana-
tion for the emerging temporal order during early stages of devel-
opment. Note that, as the transition probabilities of the matrix P 
depend on the value of ξ used in Eq. 2, the trajectories for the 
probability of attractor attainment will vary for different values of 
the error probability ξ.
In this subsection we show how the modeling formalism presented 
above can be applied to propose mechanistic explanations of observed 
patterns of temporal cell-fate attainment. In the modeling framework 
presented here, stochasticity may seem just as a modeling artifact that 
allows the study of transitions among attractors. However, a multi-
tude of studies have demonstrated both theoretically and experimen-
tally that stochasticity and the so-called biological noise are 
ubiquitously present in biological systems given the chemical nature 
of biological processes (for example see refs. 33–36).
2.4.4 Temporal Cell-Fate 
Pattern During Early 
Stages of Flower 
Development
Floral Gene Regulatory Network Models
458
Under the hypothesis that random fluctuations in a system 
may be important for cell behavior and pattern formation, Alvarez- 
Buylla and collaborators proposed a discrete stochastic model to 
address whether noisy perturbations of the GRN model for the 
floral organ determination of A. thaliana are sufficient to recover 
the stereotypical temporal pattern in gene expression during flower 
development [4]. As mentioned above, previous analysis of the 
deterministic Boolean GRN showed that the system converges 
only to ten fixed-point attractors, which correspond to the main 
cell types observed during early flower development [7]. Six of 
the attractors correspond to the four floral organ primordial cells 
within the flower meristem: sepals, petals, stamens, and carpels 
(S, P1, P2, S1, S2, and C).
Following Subheading 2.4.2, we can study the dynamics of 
cell-fate attainment of the floral organ primordial cells by defining 
a variable Xt which can take as a value any of the attractors (S, P1, 
P2, S1, S2, and C) at each time t. Then, given the six attractors of 
interest, we would like to estimate the transition matrix P, with the 
transition probabilities pij of transition from attractor i to attractor 
j as components. This matrix can be estimated by iterating Eq. 2 
and following the algorithm described in Subheading 2.4.2. 
Alvarez-Buylla and collaborators [4] followed a similar approach, 
and estimated the matrix P shown in Table 1. This matrix was esti-
mated using a value of 0.01 for the probability of error ξ in Eq. 2.
We follow the temporal evolution of the probability of reach-
ing each attractor by iterating Eq. 4 using as P the matrix just 
estimated (see Table 1). However, as mentioned in Subheading 2.4.3, 
it is necessary to specify an initial distribution pX(0), which defines 
what fraction of the whole cell population corresponds to each of 
the cell-types (S, P1, P2, S1, S2, and C) at the initial time of the 
Table 1 
Example of a transition matrix P estimated from the GRN model for the floral 
organ determination of A. thaliana. The matrix elements are the transition 
probabilities among pairs of the six attractors (S, P1, P2, S1, S2, and C). 
Probabilities where calculated in Alvarez-Buylla et al. [4] using (ξ = 0.01)
sep pe1 pe2 st1 st2 car
sep 0.939395 0.001943 0.009571 0.000083 0.000490 0.048517
pe1 0.036925 0.904162 0.009250 0.033900 0.000488 0.015275
pe2 0.009067 0.000464 0.941609 0.000024 0.048374 0.000461
st1 0.000084 0.001893 0.000020 0.936514 0.009960 0.051530
st2 0.000020 0.000001 0.002074 0.000356 0.987953 0.009597
car 0.002045 0.000034 0.000020 0.001951 0.010020 0.985930
Eugenio Azpeitia et al.
459
simulation. Since sepal primordial cells are the first to attain their 
fate in flower development, we use as an initial distribution a vector 
in which the value corresponding to the fraction of sepal cells is set 
to 1 and all the other values are set to zero; this is pX(0) = (1,0,0,0,0,0), 
where the order of the values is (S, P1, P2, S1, S2, and C). Thus, 
initially, all of the population of cells within a floral primordium is 
in the sepal attractor. Then, Eq. 4 can be iterated to follow the 
changes in the probability of reaching each one of the other attrac-
tors over time, given that the entire system started in the sepal 
configuration. The resulting normalized trajectories for the case in 
point are shown in Fig. 8 (see Note 13). The graph clearly shows 
how the trajectory for each of the attractor’s probability reaches 
its maximum at a given time. One star for each of the attractors 
was drawn in the graph just above the x-axis at the time when its 
maximal probability occurs. In accordance with biological obser-
vations, the results show that the most probable sequence of cell 
attainment is: sepals, petals, and the stamens and carpels almost 
concomitantly.
The results presented here were calculated using just one value 
for the probability of error (ξ = 0.001). In the work of Alvarez- 
Buylla and collaborators [4], it was shown that the system exhibited 
a sequence of transitions among attractors that mimics the sequence 
of gene activation configurations observed in real flowers for a level 
of noise (value of ξ) of around 0.5–10 % (see Note 11).
Fig. 8 Temporal sequence of cell-fate attainment pattern under the stochastic 
Boolean GRN model. Maximum relative probability p of attaining each attractor, 
as a function of time (in iteration steps). The value of the error probability used 
was ξ = 0.01. Stars mark the time when maximal probability of each attractor 
occurs. The most probable sequence of cell attainment: sepals, petals, carpels, 
and stamens
Floral Gene Regulatory Network Models
~ 
~ 
n, • Max-up 
~ 
" 
• Mp;.pe 
~ 
" • Max.st 
'9: '" • Max-car 
;; 
~ 
" 
g * * * 
D " .. " .. lDO 
460
The nonintuitive, constructive role of moderated noise 
perturbing the dynamics of nonlinear systems is a well-known phe-
nomenon in physics [37]. Currently, there is a growing interest in 
understanding the interplay between noise and the nonlinearity of 
biological networks [38]. Using the model formalism presented 
here, Alvarez-Buylla and collaborators concluded that the stereo-
typical temporal pattern with which floral organs are determined 
may result from a stochastic dynamic system associated with a 
highly nonlinear GRN [4]. In the light of these findings, the mod-
eling framework exposed in this section constitutes a simple 
approach to understanding morphogenesis, providing predictions 
on the population dynamics of cells with different genetic configu-
rations during development.
Boolean GRNs have been useful to study the complex logic of 
transcriptional regulation involved in cell differentiation because it 
seems that the qualitative topology of such networks, rather than 
the detailed form of the kinetic functions of gene interactions, rule 
the attractors reached. However, for some further mathematical 
developments and also for studies of the detailed behavior of GRN 
dynamics, the differences in genetic expression decay rates, thresh-
old expression values, saturation rates, and other quantitative 
aspects of GRNs can become very relevant. These aspects of GRNs 
cannot be contemplated by a discrete approach. Hence, it becomes 
necessary to investigate also continuous representations of GRN 
dynamics. Several studies reviewed here show that such continuous 
approximations of the discrete GRNs lead to novel predictions, but 
at the same time recover consistent results with those arising in the 
Boolean framework.
Several approaches have been used to describe the Boolean 
GRN as a continuous system. A well-known scheme is the piece- 
wise linear Glass dynamics of the network [39]. This model is 
based on a set of differential equations in which each continuous 
variable xi, representing the level of expression of a given gene, has 
an associated discrete variable that represents the state of expres-
sion of that gene. This is accomplished by introducing the discrete 
variables x̂i  defined as x̂ H xi i i= -( )q , where θi represents a 
threshold, and H(x) is the Heaviside step function: H(x) = 1 if x > 1, 
and H(x) = 0 if x < 1. This definition implies that gene n displays a 
dichotomic expression driven by a more gradual continuous 
dynamics. The piece-wise continuous Glass dynamics of the GRN 
is described by
 
d
d
, ,^ ^
1
x t
t
f t t x tx xi
i k i
( )
= ( ) ( )æ
è
ç
ö
ø
÷ ( )é
ëê
ù
ûú
m ¼ -  (5)
where fi are the input functions of the discrete Boolean model, 
and μ = 1/τ is the relaxation rate of the gene expression profile. 
Within this description, the microscopic configuration of the GRN 
2.5 Approximation  
to a Continuous GRN 
Model
2.5.1 Deterministic 
Approach
Eugenio Azpeitia et al.
461
at a given time is described by the set of continuous values 
{x1(t), …, xk(t)}; this set induces in turn the set of corresponding dis-
crete values ˘ ˘, ,1x t x tk( ) ( ){ }…  as the Boolean configuration of the 
network. The equilibrium states of the GRN that determine a given 
phenotype may be obtained from the condition dxi/dt = 0, which 
leads to
 x f x t x ti i k
S
1
S S˘ ˘, ,= ( ) ( )( )…  (6)
independently of the value of the relaxation rate. Even when the 
Boolean input functions fi are the same in the discrete and continu-
ous approaches, there are infinitely many microscopic configura-
tions compatible with the same Boolean configuration, and the 
discrete model of the GRN and the corresponding continuous 
piece-wise linear model are not necessarily equivalent, since the 
attractors of the two models can be different. However, numerical 
simulations to study the GRN for floral organ differentiation in A. 
thaliana, show that the Glass dynamics generate exactly the same 
ten fixed-point attractors obtained in the Boolean model, although 
the size of the corresponding attraction basins may display some 
variation [4].
An alternative approach consists in considering that the input 
functions display a saturation behavior characterized by a logistic 
or a Hill function, usually employed in biochemistry to describe 
ligand saturation as a function of its concentration. In the first case, 
the input associated to node i may be included in the form
 
Θ …
… ∈
f x x
b f x x
i k
i i k i
1
1
, , , ,
1
1 exp ,, , ,
( )  =
+ − ( )−  
 (7)
where ∈ i is a threshold level (usually ∈ i = 1/2), and bi the input 
saturation rate. It may
be easily seen that for bi > > 1, the input function becomes a 
Heaviside step function:
 Θ ∈ → ∈f H fi i i i−[ ] −[ ],  (8)
and thus displays a dichotomic behavior (in practice this may be 
achieved for, e.g., bi > 10). This approach has been employed, for 
example, in the modeling of the GRN for differentiation of 
Th cells of the immune response by Mendoza and Xenarios [40], 
or in the study of floral organ specification in A. thaliana [1].
On the other hand, Hill-type inputs of GRNs have been 
employed in a number of investigations on biological development 
and differentiation (see the review in ref. 41). They have the following 
structure:
 Ξ
∈
n
i
i i
n
i
n
i
n
f
A f
f
( ) [ ] =
( )
( ) + ( )
,  (9)
Floral Gene Regulatory Network Models
462
with the parameter n, an integer number, and Ai the maximum 
asymptotic value attained by the input. The latter approach was 
used by Zhou et al. [42], to model pancreatic cell fates; and by 
Wang and coworkers [43] to study myeloid and erythroid cell 
fates. The approximation to be used depends on the nature of the 
problem under study. In fact, the GRN inputs could be described 
also by any set of polynomial functions that reflect the biological 
interactions of the network.
Another approach that can be used to translate the logical into 
continuous functions involves the use of “fuzzy logics” proposed 
by L. A. Zadeh [44] to study systems that do not follow strictly 1 
or 0 truth-values. This is achieved by using the following rules
 
x t and x (t) x t x t
x t or x (t) x t x t
i j i j
i j i j
( ) ( ) ( ) 
( ) ( ) (
→
→
min ,
max , ) 
( ) −not x t x (t)
.
i i→1  (10)
Here, the operators, min and max mean to choose between the 
minimum and maximum values of the functions xi and xj at a given 
time t. It can be shown that these rules lead to a Boolean algebra 
[1]. One possible disadvantage of this proposition is that it involves 
only piece-wise differential functions. Another possibility is to con-
sider the following algorithm:
 
x t and x (t) x (t) x (t)
x t or x (t) x (t) x (t) x (t) x
i j i j
i j i j i j
( )
( ) + −
→ ⋅
→ ⋅ (t)
not x t x (t)i i( ) −→1
.
 (11)
The structure of the expressions associated to the logical con-
nectors “and” and “not” is obvious, while the expression for “or” 
is derived by substituting such expressions into De Morgan’s law: 
not(xi or xj) = (not xi) and (not xj). As before, it may be straightfor-
wardly checked that these rules define a Boolean algebra. For 
example, a logic input like
 f x or x and not x1 1 2 3= ( ) ( )  
would read:
 f x x x x x .21 1 2 1 31= + -( ) -( )×  
We now proceed to write the equation for the GRN continu-
ous dynamics. By assuming that the source of gene activation can 
be characterized, for example, by a logistic-type behavior, we may 
introduce the following set of differential equations:
 d
d
,, , ,1
x
t
f x x xi
i k i i= ( )  −Θ … µ  (12)
where μi = 1/τi represents the expression decay rate of node i of the 
GRN. Notice that within this approach we consider that, in gen-
Eugenio Azpeitia et al.
463
eral, each gene may have its own characteristic decay rate. 
This assumption introduces further richness into the description, 
as a hierarchy of times of genetic expression may define alternative 
routes to cell fates. In particular, notice that the steady states of the 
GRN, given by the condition dxi/dt = 0, lead to the expression
 x f x x .i
S
i
i k= ( )



1
,, , ,1
S S
µ
Θ …  (13)
Taking into account that the node inputs are defined by logical 
sentences with a Boolean architecture, then the attractor set 
obtained in this case is equivalent by construction to the set derived 
in the discrete Boolean approach. Thus, if a given attractor arising 
in the discrete Boolean approach has an expression pattern like 
{1,0,0,1,1, …}, the corresponding pattern in the continuous 
approach would have the structure {1/μi, 0, 0, 1/μ4, 1/μ5, …}, so 
that they become identical when μi = 1 (with the possible exception 
of some isolated attractors). The consideration of the several relax-
ation rates for gene expression dynamics introduces an important 
difference with respect to Glass dynamics. For example, in the case 
that a gene has a large decay rate, corresponding to μi > > 1, then 
xi
S → 0, and the expression pattern would differ with that arising 
when μi = 1. Then, the dynamic behavior of a gene with a large decay 
rate (short expression time) would be equivalent to an effective 
mutation associated to lack of functionality. Similarly, the case μi < < 1 
would correspond to an over-expression of that gene. We conclude 
that the gene expression dynamics is not only regulated by the GRN 
interactions topology, but also by the hierarchy of relative expres-
sion times of its components.
On the other hand, the system also may acquire very different 
behaviors depending on the value of the saturation rate. As men-
tioned before, for bi > > 1, the input function becomes a Heaviside 
step function. In the case, bi = 1, the input function would show a 
softer behavior. It turns out that in this latter case the attractor set 
may change drastically with respect to that obtained in the Boolean- 
like case. This plasticity could be employed to study regulatory 
systems with a hybrid functionality consisting of transcriptional reg-
ulatory logics that are well described with Boolean GRN, and exter-
nal or coupled signaling transduction pathways that have continuous 
behaviors and which can impact the dynamics of some of the GRN 
components.
3  Notes
 1. A developmental module incorporates a set of necessary and 
sufficient molecular components for a particular cell differen-
tiation or morphogenetic process. It is considered a module 
because it is largely robust to initial conditions and it attains 
Floral Gene Regulatory Network Models
464
certain attractors robustly. The uncovered GRN underlying 
the ABC patterns of gene activation and the early subdifferen-
tiation of the flower meristem into four concentric regions or 
primordial floral organ cells, thus constitutes a developmental 
model. Other developmental modules involved in flower 
development could be those involved in: the cellular subdif-
ferentiation of each one of the floral organ primordia during 
organ maturation, determining floral organ number and spatial 
disposition, in the dorso-ventrality or shape of floral organs, 
ovule maturation, etc.
 2. In the table that formalizes the experimental data, if the gene 
or protein is expressed register a “1,” and if not a “0.” If some 
components have expression patterns with cyclic behavior, 
they could be part of cyclic attractors. In some cases, a discrete 
network with more than two activation states can be postu-
lated if deemed necessary. Quantitative variation in expression 
levels can be also incorporated later in a continuous model 
approximated from the discrete one.
 3. Several other algorithms exist to numerically find the attractors of 
a Boolean Network in an efficient way. For examples, see ref. 20.
 4. It is important to keep in mind that the “AND” and “OR” 
logical operators can be interconverted. For instance, the logi-
cal statement “GEN1 AND GEN2” is equivalent to the logical 
statement “NOT (NOT GEN1 OR NOT GEN2).” Because 
of this, most truth tables (except the simplest ones, like the 
constants) have many equivalent logical statements. 
Consequently, each Boolean function can be formalized as a 
unique truth table, but can be described with one or many 
equivalent logical statements (Fig. 9).
 5. Sometimes, the experimental information is not enough to 
completely define the Boolean functions. For example, in La 
Fig. 9 Equivalence between truth tables and logical statements. As observed each truth table have many 
equivalent logical statements while each logical statement is represented by a unique truth table
Eugenio Azpeitia et al.
TGENlt,",) = GEN!lt) & IGEN2It) I GEN3It)) 
TGENlt+T) = IGEN!lt) & GEN3It)) I 
o o o o IGEN!lt) & GEN2It)) 
o o 1 O 
O 1 O O = TGENlt,",) = IGEN!lt) & ! GEN2It) & GEN3It)) I 
O 1 1 O 
IGEN!lt) & GEN2It) & ! GEN3It)) I 
1 O O O 
IGEN!lt) & GEN2It) & GEN3It)) 
1 O 1 1 TGENlt,",) = ! I ! GEN!lt) & I ! GEN2It) I ! GEN3It))) 
1 1 O 1 
1 1 1 1 
465
Rota and collaborators [19] Boolean functions were first 
 generated considering only confirmed direct molecular inter-
actions. However, gaps in the experimental information pre-
cluded the generation of a unique set of Boolean function 
determining the GRN. Consequently, they predicted possible 
interactions by looking for consensus binding sites in the pro-
moters of the included nodes and introducing some specula-
tive hypothesis of molecular interactions.
For example, imagine that TGEN expression disappears 
when you generate single loss-of-function alleles of GEN1 and 
GEN2, while TGEN expression is promoted if we over-express 
both GEN1 and GEN2. Consequently, we conclude that 
GEN1 and GEN2 are both positive and necessary regulators 
for TGEN expression. However, this experimental data do not 
say anything about what happens to TGEN expression in the 
simultaneous absence of GEN1 and GEN2. In such a case we 
would have an incompletely characterized Boolean Function 
(Fig. 10). Such incompletely characterized Boolean functions 
can also appear due to asynchrony and interactions with the 
environment [45]. The inclusion of asynchrony in the model 
provides a more realistic description of our system, while envi-
ronmental inputs influence is pervasive in biological systems. 
Hence, the incorporation of incomplete Boolean functions in 
a model is an instrumental tool. There are many ways to approach 
this problem: we could test all possible Boolean functions (as in 
ref. 19), introduce asynchrony in our model, give a probability 
to each possible Boolean function, or even directly work with 
incomplete Boolean functions. Several free software programs 
are capable of considering asynchrony, probabilities for differ-
ent logical functions or can work with incomplete Boolean 
functions, such as ANTELOPE [45] and BoolNet [46].
Fig. 10 Complete and incomplete characterized Boolean functions. While in complete characterized Boolean 
functions the value of TGEN in all row of the truth tables is specified, in incomplete characterized Boolean 
functions in one or more rows of the truth table is not specified. Incomplete characterized Boolean function can 
be the result of missing information data, asynchrony or environmental perturbations and can be resolved with 
different approaches as explained in the main text
Floral Gene Regulatory Network Models
Complete charactcf tZed aoorcan funcnon 
o 
O 
1 
1 
O 
1 
O 
1 
O 
O 
O 
1 
Incomplete characterizcd aoorcan funcnon 
O 
O 
1 
1 
O 
1 
O 
1 
• 
O 
O 
1 
466
 6. Sometimes we cannot represent the available experimental 
data with a Boolean formalism because we need more values to 
represent our nodes’ activity. For example, imagine that GEN1 
differentially affect TGEN in the loss-of function, when nor-
mally expressed and when over expressed. This can be resolved 
replacing the Boolean formalism with a multivalued or a 
continuous approach. In a multivalued approach, the nodes 
can take as many values as necessary. In the last example, we 
could allow GEN1 to have three values, namely, 0 when is 
OFF, 1 when is normally expressed and 2 when is over 
expressed. It is important to note that a Boolean formalism can be 
approximated to a continuous one as was explained in the last 
section of this paper. For example, Espinosa-Soto and collabo-
rators [7] initially followed a multivalued modeling approach, 
which was later shown to yield the same qualitative results 
when transformed into a Boolean system [17]. Similar situa-
tions have been documented when transforming a continuous 
into a Boolean model (e.g., [6, 47]). Currently some software 
applications allow the analysis of discrete multivalued networks 
(e.g., GINSIM) [48].
 7. As mentioned above, sometimes the experimental information 
is not enough to generate the Boolean function. We can also 
find contradictory information linked to particular gene inter-
actions. For example, one author may report that GEN1 posi-
tively regulates TGEN, while another one may report that 
GEN1 is a negative regulator of TGEN. In cases like this, 
models are extremely helpful, even when they could be consid-
ered incomplete. With models we can test both suggestions in 
a fast and cheap way. The result that better reproduces the 
experimentally observed system’s behavior should be consid-
ered the most likely hypothesis. For example, in La Rota and 
collaborators [19] GRN model of sepal primordium they gen-
erated multiple sets of Boolean functions describing their GRN 
and selected those that recovered the expected attractors and 
mutant phenotypes. At other times GRN models can be also 
used to explain apparent contradictions or disputes concerning 
the interpretation of experimental data.
 8. There are several free software packages to recover the attrac-
tors and basins of attraction of Boolean GRN, including 
ANTELOPE [45], GINSIM [48], BoolNet [46], Atalia [12], 
GNbox [49], GNA [50], and BioCham [51].
 9. It is important to note that recovering the expected attractors 
when the mutants are simulated does not guarantee that the 
model is correct, because networks with different topologies 
can sometimes reach the same attractors [52]. However, we 
can assure that a GRN model that is unable to reproduce all 
mutants is incorrect.
Eugenio Azpeitia et al.
467
 10. Although stochasticity in Boolean models of GRNs is commonly 
modeled using the SIN model (see Subheading 2.4.1), another 
method called the stochasticity in functions (SIF) has been 
introduced recently. The objective of this method is to model 
stochasticity at the level of biological functions (i.e., Boolean 
functions in the GRN), and not just by flipping the state of a 
gene as in the SIN model (for details see refs. 20, 32).
 11. It could be the case that interesting, nontrivial behaviors may 
be uncovered just at certain levels of the error probability ξ 
(e.g., noise). Thus, as customary in numerical explorations, it 
is necessary to test different values of ξ. However, one expects 
generic, robust behavior to be observed under a relatively wide 
range of noise levels. Moreover, the stochastic modeling of 
GRN can thus be useful to make inferences concerning the 
range of noise levels that are experienced in particular develop-
mental systems under study.
 12. When trying to iterate Eq. 4, make sure that the order in which 
the position corresponding to each attractor state in the initial 
vector pX(0) is the same as the one for the columns in the tran-
sition the matrix P. In other words, if the fraction of cells in 
attractor A is specified in the position i of the initial vector, the 
row i of the transition matrix should correspond to the prob-
abilities of transition from attractor A to the other attractors.
 13. It can be the case that the heights of the trajectories, which 
correspond to the temporal evolution of the probability of 
being in each attractor, differ considerably. This is to be 
expected; given that the basins of the different attractors vary 
in size, and so do their absolute probabilities. One way to 
transform the data in order to obtain a graph where the heights 
of the trajectories are of comparable size is to normalize each 
probability value with respect to the maximum of each attrac-
tor’s curve (e.g., dividing the probability value by the  maximum 
value). We followed this approach to obtain the graph in Fig. 8, 
where also the trajectories corresponding to attractors se1 and 
se2; and st1 and st2 where respectively added to obtain only 
one trajectory for the attractor se and one for st. However, it is 
important to note that, as we are interested in the temporal 
order in which the attractors reach its maximum probability, 
this normalization process is not necessary. The order of 
appearance of the maximum value of the probability of each 
attractor in the original simulated trajectories would be the 
same as the one observed in the normalized trajectories. The 
normalization step just allows us to obtain a clearer graph. In 
the graph in Fig. 7, we draw one star for each of the attractors 
just above the x-axis at the time when its maximal probability 
occurs. The observed pattern is exactly the same in the simu-
lated trajectories before the normalization.
Floral Gene Regulatory Network Models
468
References
 1. Villarreal C, Padilla-Longoria P, Alvarez-Buylla 
ER (2012) General theory of genotype to phe-
notype mapping: derivation of epigenetic land-
scapes from N-node complex gene regulatory 
networks. Phys Rev Lett 109(118102):1–5
 2. Alvarez-Buylla ER, Balleza E, Benítez M, 
Espinosa-Soto C, Padilla-Longoria P (2008) 
Gene regulatory network models: a dynamic 
and integrative approach to development. SEB 
Exp Biol Ser 61:113–139
 3. Alvarez-Buylla ER, Azpeitia E, Barrio R, 
Benítez M, Padilla-Longoria P (2010) From 
ABC genes to regulatory networks, epigenetic 
landscapes and flower morphogenesis: making 
biological sense of theoretical approaches. 
Semin Cell Dev Biol 21(1):108–117
 4. Alvarez-Buylla ER, Chaos A, Aldana M, 
Benítez M, Cortes-Poza Y, Espinosa-Soto C, 
Hartasánchez DA, Lotto RB, Malkin D, 
Escalera Santos GJ, Padilla-Longoria P (2008) 
Floral morphogenesis: stochastic explorations 
of a gene network epigenetic landscape. PLoS 
One 3(11):e3626
 5. Mendoza L, Alvarez-Buylla ER (1998) 
Dynamics of the genetic regulatory network 
for Arabidopsis thaliana flower morphogene-
sis. J Theor Biol 193(2):307–319
 6. Albert R, Othmer HG (2003) The topology of 
the regulatory interactions predicts the expres-
sion pattern of the segment polarity genes in 
Drosophila melanogaster. J Theor Biol 223(1): 
1–18
 7. Espinosa-Soto C, Padilla-Longoria P, Alvarez- 
Buylla ER (2004) A gene regulatory network 
model for cell-fate determination during 
Arabidopsis thaliana flower development that 
is robust and recovers experimental gene 
expression profiles. Plant Cell 16:2923–2939
 8. Azpeitia E, Benítez M, Vega I, Villarreal C, 
Alvarez-Buylla ER (2010) Single-cell and cou-
pled GRN models of cell patterning in the 
Arabidopsis thaliana root stem cell niche. 
BMC Syst Biol 4:134
 9. Albert I, Thakar J, Li S, Zhang R, Albert R 
(2008) Boolean network simulations for life 
scientists. Source Code Biol Med 3:16
 10. Albert R, Wang RS (2009) Discrete dynamic 
modeling of cellular signaling networks. 
Methods Enzymol 467:281–306
 11. Assmann SM, Albert R (2009) Discrete 
dynamic modeling with asynchronous update, 
or how to model complex systems in the 
absence of quantitative information. Methods 
Mol Biol 553:207–225
 12. Alvarez-Buylla ER, Benítez M, Corvera-Poiré 
A, Chaos CA, de Folter S, Gamboa de Buen A, 
Garay-Arroyo A, García-Ponce B, Jaimes- MF, 
Pérez-Ruiz RV, Piñeyro-Nelson A, Sánchez- 
Corrales YE (2010) Flower development. 
Arabidopsis Book 8:e0127
 13. Pelaz S, Tapia-López R, Alvarez-Buylla ER, 
Yanofsky MF (2001) Conversion of leaves into 
petals in Arabidopsis. Curr Biol 11(3): 
182–184
 14. Barrio RÁ, Hernández-Machado A, Varea C, 
Romero-Arias JR, Alvarez-Buylla E (2010) 
Flower development as an interplay between 
dynamical physical fields and genetic networks. 
PLoS One 5(10):e13523
 15. Kauffman S (1969) Homeostasis and differen-
tiation in random genetic control networks. 
Nature 224:177–178
 16. Mendoza L, Thieffry D, Alvarez-Buylla ER 
(1999) Genetic control of flower morphogen-
esis in Arabidopsis thaliana: a logical analysis. 
Bioinformatics 15(7–8):593–606
 17. Chaos Á, Aldana M, Espinosa-Soto C et al 
(2006) From genes to flower patterns and 
evolution: dynamic models of gene regulatory 
networks. J Plant Growth Regul 
25(4):278–289
 18. Sanchez-Corrales YE, Alvarez-Buylla ER, 
Mendoza L (2010) The Arabidopsis thaliana 
flower organ specification gene regulatory net-
work determines a robust differentiation pro-
cess. J Theor Biol 264:971–983
 19. La Rota C, Chopard J, Das P, Paindavoine S, 
Rozier F, Farcot E, Godin C, Traas J, Monéger 
F (2011) A data-driven integrative model of 
sepal primordium polarity in Arabidopsis. Plant 
Cell 23(12):4318–4333
 20. Garg A, Mohanram K, De Micheli G, Xenarios 
I (2012) Implicit methods for qualitative mod-
eling of gene regulatory networks. Methods 
Mol Biol 786:397–443
 21. Alvarez J, Guli CL, Yu XH, Smyth DR (1992) 
terminal flower: a gene affecting inflorescence 
development in Arabidopsis thaliana. Plant J 
2(1):103–116
 22. Shannon S, Meeks-Wagner DR (1991) A 
mutation in the Arabidopsis TFL1 gene affects 
inflorescence meristem development. Plant Cell 
3(9):877–892
 23. Parcy F, Bomblies K, Weigel D (2002) 
Interaction of LEAFY, AGAMOUS and 
TERMINAL FLOWER1 in maintaining floral 
meristem identity in Arabidopsis. Development 
129(10):2519–2527
Eugenio Azpeitia et al.
469
 24. Conti L, Bradley D (2007) TERMINAL 
FLOWER1 is a mobile signal controlling 
Arabidopsis architecture. Plant Cell 19(3): 
767–778
 25. Chen L, Cheng JC, Castle L, Sung ZR (1997) 
EMF genes regulate Arabidopsis inflorescence 
development. Plant Cell 9(11):2011–2024
 26. Liljegren SJ, Gustafson-Brown C, Pinyopich A 
(1999) Interactions among APETALA1, 
LEAFY, and TERMINAL FLOWER1 specify 
meristem fate. Plant Cell 11(6):1007–1018
 27. Ratcliffe OJ, Bradley DJ, Coen ES (1999) 
Separation of shoot and floral identity in 
Arabidopsis. Development 126(6):1109–1120
 28. Gustafson-Brown C, Savidge B, Yanofsky MF 
(1994) Regulation of the Arabidopsis floral 
homeotic gene APETALA1. Cell 76(1): 
131–143
 29. Gómez-Mena C, de Folter S, Costa MMR, 
Angenent GC, Sablowski R (2005) 
Transcriptional program controlled by the flo-
ral homeotic gene agamous during early organ-
ogenesis. Development 132(3):429–438
 30. Kitano H (2007) Towards a theory of biologi-
cal robustness. Mol Syst Biol 3:137
 31. Whitacre JM (2012) Biological robustness: 
paradigms, mechanisms, and systems princi-
ples. Front Genet 3:67
 32. Garg A, Mohanram K, Di Cara A, De Micheli 
G, Xenarios I (2009) Modeling stochasticity 
and robustness in gene regulatory networks. 
Bioinformatics 25:i101–i109
 33. Samoilov MS, Price G, Arkin AP (2006) From 
fluctuations to phenotypes: the physiology of 
noise. Sci STKE 2006:re17
 34. Hoffmann M, Chang HH, Huang S, Ingber 
DE, Loeffler M, Galle J (2008) Noise-driven 
stem cell and progenitor population dynamics. 
PLoS One 3(8):e2922
 35. Eldar A, Elowitz MB (2010) Functional roles 
for noise in genetic circuits. Nature 
467(7312):167–173
 36. Balázsi G, van Oudenaarden A, Collins JJ 
(2011) Cellular decision making and biological 
noise: from microbes to mammals. Cell 
144(6):910–925
 37. Horsthemke W, Lefever R (1984) Noise- 
induced transitions: theory and applications in 
physics, chemistry, and biology. Springer, Berlin
 38. Chalancon G, Ravarani CNJ, Balaji S, 
Martinez-Arias A, Aravind L, Jothi R, Babu 
MM (2012) Interplay between gene expression 
noise and regulatory network architecture. 
Trends Genet 28(5):221–232
 39. Glass L (1975) Classification of biological 
networks by their qualitative dynamics. J Theor 
Biol 54:85–107
 40. Mendoza L, Xenarios I (2006) A method for 
the generation of standardized qualitative 
dynamical systems of regulatory networks. 
Theor Biol Med Model 3:13
 41. Ferrell JE Jr (2012) Bistability, bifurcations, 
and Waddington’s epigenetic landscape. Curr 
Biol 22:R458–R466
 42. Zhou JX, Brusch L, Huang S (2011) Predicting 
pancreas cell fate decisions and reprogramming 
with a hierarchical multi-attractor model. PLoS 
One 6(3):e14752
 43. Wang J, Zhang K, Xua L, Wang E (2011) 
Quantifying the Waddington landscape and 
biological paths for development and differen-
tiation. Proc Natl Acad Sci 108:8257–8262
 44. Zadeh LA (1965) Fuzzy sets. Inf Control 
8:338–353
 45. Arellano G, Argil J, Azpeitia E, Benítez M, 
Carrillo M, Góngora P, Rosenblueth DA, 
Alvarez-Buylla ER (2011) “Antelope”: a 
hybrid-logic model checker for branching-time 
Boolean GRN analysis. BMC Bioinformatics 
12:490
 46. Müssel C, Hopfensitz M, Kestler HA (2010) 
BoolNet—an R package for generation, 
 reconstruction and analysis of Boolean net-
works. Bioinformatics 26(10):1378–1380
 47. von Dassow G, Meir E, Munro EM, Odell GM 
(2000) The segment polarity network is a 
robust developmental module. Nature 
406(6792):188–192. doi:10.1038/35018085
 48. Naldi A, Berenguier D, Fauré A, Lopez F, 
Chaouiya C (2009) Logical modelling of regu-
latory networks with GINsim 2.3. Biosystems 
97(2):134–139
 49. Corblin F, Fanchon E, Trilling L (2010) 
Applications of a formal approach to decipher 
discrete genetic networks. BMC Bioinformatics 
11(1):385
 50. de Jong H, Geiselmann J, Hernandez C, Page 
M (2003) Genetic network analyzer: qualita-
tive simulation of genetic regulatory networks. 
Bioinformatics 19(3):336–344
 51. Calzone L, Fages F, Soliman S (2006) Biocham: 
an environment for modeling biological systems 
and formalizing experimental knowledge. 
Bioinformatics 22(14):1805–1807
 52. Azpeitia E, Benítez M, Padilla-Longoria P, 
Espinosa-Soto C, Alvarez-Buylla ER (2011) 
Dynamic network-based epistasis analysis: 
boolean examples. Front Plant Sci 2:92
Floral Gene Regulatory Network Models
455
Jose M. Alonso and Anna N. Stepanova (eds.), Plant Functional Genomics: Methods and Protocols, Methods in Molecular Biology,
vol. 1284, DOI 10.1007/978-1-4939-2444-8_23, © Springer Science+Business Media New York 2015
 Chapter 23 
 Descriptive vs. Mechanistic Network Models 
in Plant Development in the Post-Genomic Era 
 J.  Davila-Velderrain ,  J. C.  Martinez-Garcia , and  E. R.  Alvarez-Buylla 
 Abstract 
 Network modeling is now a widespread practice in systems biology, as well as in integrative genomics, and 
it constitutes a rich and diverse scientifi c research fi eld. A conceptually clear understanding of the reasoning 
behind the main existing modeling approaches, and their associated technical terminologies, is required to 
avoid confusions and accelerate the transition towards an undeniable necessary more quantitative, multi-
disciplinary approach to biology. Herein, we focus on two main network-based modeling approaches that 
are commonly used depending on the information available and the intended goals: inference-based meth-
ods and system dynamics approaches. As far as data-based network inference methods are concerned, they 
enable the discovery of potential functional infl uences among molecular components. On the other hand, 
experimentally grounded network dynamical models have been shown to be perfectly suited for the mech-
anistic study of developmental processes. How do these two perspectives relate to each other? In this 
chapter, we describe and compare both approaches and then apply them to a given specifi c developmental 
module. Along with the step-by-step practical implementation of each approach, we also focus on discuss-
ing their respective goals, utility, assumptions, and associated limitations. We use the gene regulatory 
network (GRN) involved in  Arabidopsis thaliana Root Stem Cell Niche patterning as our illustrative 
example. We show that descriptive models based on functional genomics data can provide important back-
ground information consistent with experimentally supported functional relationships integrated in mech-
anistic GRN models. The rationale of analysis and modeling can be applied to any other well-characterized 
functional developmental module in multicellular organisms, like plants and animals. 
 Key words  Gene regulatory networks ,  Root stem cell niche ,  Cell differentiation ,  Attractor , 
 Morphogenesis ,  System dynamics ,  Mathematical model ,  Computational simulation ,  Network inference , 
 Descriptive model ,  Mechanistic model 
1  Introduction 
 Mathematical modeling and computational modeling are becoming 
an indispensable scientifi c research practice in modern post- genomic 
biology. The term  systems biology has been coined to defi ne this 
new fi eld of study, highly characterized by its fuzzy disciplinary 
boundaries. The  systems perspective to biology embraces the notion 
of biological behavior as resulting from the collective action of 
456
multiple interacting components at different temporal and spatial 
scales and levels of organization. Collective behavior emerges from 
the component interactions themselves and not only from the 
specifi c function of the individual components of a given complex 
system. Multicellular development includes several such collective 
processes involving molecular genetic components that lead to cell 
growth, proliferation, and differentiation, and to the eventual 
emergence of spatial and temporal structural morphogenetic 
patterns. All these dynamical processes are, to a great extent, self-
organized and thus occur as unorchestrated choreographies that 
can be understood in terms of specifi c properties of networks or 
dynamical patterning modules of different nature [ 1 – 4 ]. The study 
of collective phenomena in biological systems, however, requires 
approaches that go beyond the discovery and description of indi-
vidual molecular components [ 5 ,  6 ]. Uncovering how dynamical 
behavior emerges and is robustly maintained, from the genetic and 
non-genetic components and their interactions, requires the use of 
mathematical/computational models [ 6 – 9 ]. In this chapter, we 
show how these formal tools enable the integration of molecular 
genetic data into network-based models. 
 The ongoing genomic revolution has been quite successful in 
uncovering a fairly complete set of molecular components at differ-
ent levels of regulation and for multiple organisms [ 10 – 13 ]. At the 
same time, developmental genetic studies have successfully charac-
terized sets of molecular regulators known to be tightly associated 
with specifi c developmental processes, and with the establishment 
of morphogenetic patterns [ 14 – 16 ]. In post-genomic biology there 
is an increasing need to transcend the reductionist modes of expla-
nation, to go beyond the traditional enumeration and the book-
keeping description of molecular processes and components, and 
to integrate this knowledge into explanatory models [ 5 ,  17 ,  18 ]. 
Towards this goal, we can distinguish two important questions: (1) 
given a set of known molecular players, how can we gain insights 
into their regulatory interactions; and (2) once a set of molecules 
and their interactions are known, how can we study the associated 
dynamic behavior and, ultimately, the phenotypic manifestation of 
such a molecular regulatory system. In this chapter, we show how 
to approach these questions within the context of the practical 
implementation of gene regulatory network (GRN) models. 
 GRN models are considered as one of the most powerful tools 
for the study of complex molecular systems [ 2 ,  4 ,  7 ]. A GRN is 
composed of a given set of molecular players (e.g., genes, proteins) 
and a given set of interactions among them, which represent regu-
latory infl uences. Then, for the case of GRNs, the question (1) 
above refers, more precisely, to the process of inferring these inter-
actions from some source of experimental data [ 19 – 21 ]. Question 
(2) above implies a mechanistic perspective: the use of additional 
information and assumptions about underlying processes driving 
J. Davila-Velderrain et al.
457
the dynamical behavior in order to simulate it and to uncover the 
consequences of the dynamical interplay [ 6 ]. From the modeling 
point of view, these two tasks are associated with two different 
approaches. (1) Data-based  descriptive models are used to postu-
late putative regulatory interactions among molecular players 
through the quantitative descriptions of the observed relationships 
among a set of measured variables. On the other hand, (2)  mecha-
nistic dynamical models are used to represent, in a quite useful 
simplifi ed manner, specifi c processes underlying cell behavior, 
using for this well-posed descriptive equations or computer-based 
encoded systems knowledge [ 22 ]. In the latter case, the resultant 
models enable the study of how cell behavior changes over time, as 
well as the long-term consequences of the underlying dynamical 
processes. 
 The descriptive ( statistical ) approach is commonly used as a 
way to make sense of large-scale genomic data [ 23 ,  24 ]. On the 
other hand, the mechanistic perspective is widely applied to small 
or moderate-order well-characterized biological processes [ 2 ]. 
Given that genome-scale networks are composed of multiple struc-
tural and functional modules [ 25 – 28 ], others and we have pro-
posed to use GRN models to discover robust modules and explore 
their dynamic behavior [ 29 – 31 ]. Following this line of research, in 
this chapter we contrast the descriptive and mechanistic approaches 
taking as an illustrative example a recently well-characterized GRN 
model: the GRN involved in  Arabidopsis thaliana Root Stem Cell 
Niche (SCN) developmental dynamics [ 32 ,  33 ]. Using this devel-
opmental module as an example, we show: (1) how a data-based, 
descriptive approach can be applied to propose putative gene inter-
actions that later can be included in mechanistic GRN models; (2) 
how a dynamical GRN model is constructed from published 
molecular experimental data; (3) the common steps followed in 
the dynamic analysis of a GRN mechanistic model; and (4) a com-
parison between the inferred descriptive GRN model and the well- 
characterized mechanistic dynamic GRN model. 
 Network modeling in post-genomic biology is a diverse practice. 
Different, well-established traditions exist within the mathematical 
and physical sciences, where terms and defi nitions are commonly 
adopted dependent on the context. In multidisciplinary fi elds such 
as systems biology and integrative genomics, however, such dis-
tinctions get blurred. The problem is particularly acute in molecu-
lar network modeling: computer scientists, statisticians, engineers, 
physicists, and mathematicians are all trying to approach the prob-
lem making important contributions [ 24 ,  34 – 36 ]. It is diffi cult to 
devise a consensus within such diversity. Aware of this problem, we 
start by conceptually distinguishing between the two general mod-
eling traditions, namely a  descriptive vs. a  mechanistic modeling 
approach. For each case, we defi ne key terminology to be used in 
1.1  Defi nitions
Descriptive vs. Mechanistic Network Models in Plant Development…
458
the sections that follow. Although we focus the discussion on GRN 
models, the comparison in this section concerns the general prac-
tice of mathematical/computational modeling. In particular, for 
each modeling perspective, we defi ne general modeling concepts 
such as  validation ,  prediction , and  explanation ( see Table  1 ).
 A descriptive model is a quantitative summary of the observed rela-
tionships among a set of measured variables [ 22 ]. In the case of 
GRN modeling, the variables commonly correspond to genes 
whose activity is measured by quantifying gene expression. 
Functional genomic data (e.g., microarray or next-generation 
sequencing (NGS) data) are commonly used as the set of measure-
ments [ 19 ]. 
 Goals : The main goal of descriptive, inferential approaches is to dis-
cover new  knowledge . In general, descriptive models aim at  fi nding 
1.1.1  Descriptive Models
 Table 1 
 General modeling concepts 
 Descriptive modeling 
 Model  A mathematical expression or computer algorithm that relates the values of one 
or more  responsive (dependent) variables with the values of a set of  predictor 
(independent) variables. 
 Prediction  Calculated values of the responsive variables by taking specifi c values of the 
predictor variables as input to the model. 
 Explanation  A predictor variable  x is said to  explain a responsive variable  y if the predicted 
values for  y are in agreement (to a certain degree) with the observed values in 
a particular dataset comprising empirical values of  x and  y . 
 Validation  The practice of testing the performance of a model by testing its predictive 
power using an independent dataset. 
 Causal attribution  It is not possible to postulate the reasons why a certain quantitative relationship 
embedded in the model is able to  explain one variable in terms of the other—
“ correlation does not imply causation .” 
 Mechanistic modeling 
 Model  Set of equations or computer code that describe how simplifi ed properties of a 
real-world entity (system) change over time as a result of specifi c underlying 
processes. 
 Prediction  Forecasting the future properties of the system or their long-term behavior. 
 Explanation  The processes considered in the model account for the observed system behavior. 
 Validation  The practice of contrasting model predictions with experimental observations of 
the real-world entity. 
 Causal attribution  The predicted behavior results from the underlying  causal processes considered 
in the model. The model is built by explicitly considering the processes that 
produce our observations. 
J. Davila-Velderrain et al.
459
novel hypotheses regarding the functional infl uences among 
molecular components amidst the mass of high-throughput 
data [ 23 ]. In the case of GRN models, this corresponds precisely 
to fi nding putative function-specifi c network nodes and edges. 
 Main assumptions : The reasoning is based on the idea that molecular 
components that share discernible patterns in high- throughput 
data sets also share experimentally testable biological (functional) 
relationships. 
 Main limitations : (1) A descriptive model says nothing about  why 
the variables are related the way they are ( see  Note 1 ). (2) We can 
only be confi dent that the relationships apply to the conditions 
(e.g., samples) where the data come from ( see  Note 2 ). It might 
apply to other conditions, for example to the same tissue, or even 
to other tissues, but it might not. 
 Conclusions to draw : The connected nodes (genes) show certain 
coordinated statistical activity through the sample conditions 
included in the data set ( see  Note 3 ). Subsets of molecules partici-
pating in similar biological processes, even if they do not have 
physical interactions, can be uncovered with these models. 
However, the observation of correlated behavior does not neces-
sarily imply a functional relationship (causalities are not always easy 
to discern). The results should be taken as one source of inconclu-
sive evidence—useful to be integrated with further analysis, none-
theless. We must point out that diverse applications that follow a 
descriptive approach have been integrated recently in the analysis 
of plant transcriptomes [ 37 ,  38 ]. 
 A mechanistic, dynamic model is a simplifi ed representation of 
some real-world entity, in terms of descriptive equations or 
computer- based encoded systems knowledge [ 22 ]. The model is 
called  dynamic because it describes how system properties change 
over time. A dynamic model is  mechanistic because it is built by 
explicitly considering the processes that produce our observations 
(i.e., the involved processes are considered in term of the workings 
of coupled individual components). Relationships between vari-
ables emerge from the model as the result of the underlying pro-
cesses. In the case of GRNs, the process of interest is developmental 
dynamics, i.e., the establishment of the patterns of cellular differ-
entiation and structural morphogenesis [ 4 ,  7 ]. 
 Goals : The main goal of the mechanistic approaches is scientifi c under-
standing [ 17 ,  22 ,  39 ]. More specifi cally, answering question such as: 
How do we create understanding out of validated bits of knowledge? 
Can processes A and B account for pattern C? Which of several con-
tending sets of assumptions is best able to account for the data? Given 
that processes A and B occur, what consequences do we expect to 
observe? Where are the holes in our understanding? ( see  Note 4 ). 
1.1.2  Mechanistic Model
Descriptive vs. Mechanistic Network Models in Plant Development…
460
 Main assumptions : In a mechanistic model, the postulated underlying 
processes, thought to be driving the system’s observed behavior, 
effectively constitute assumptions. These assumptions should 
refl ect the current state of domain knowledge. In the case of GRNs, 
it is generally postulated that the time-dependent behavior of the 
activity of each gene is driven by the coordinated behavior of the 
genes regulating it, which are in turn also subject to regulation. 
The overall result of such complex network of mutual regulatory 
interactions is a restrictive behavior: the present activity state of all 
the genes in the network, and the regulatory interactions among 
them, determine the future activity state. 
 Main limitations : (1) Identifying which state variables and processes 
are important for your modeling purposes is not trivial. Thus, the 
construction of mechanistic models is a time-consuming process. 
(2) The available knowledge upon which the model is constructed 
is essentially incomplete. In the case of GRNs, it is frequently the 
case that certain molecular players and key regulatory interactions 
have not been characterized by the time the model is constructed. 
But the GRN construction process and modeling is useful to identify 
and evaluate such gaps in experimental knowledge ( see  Note 5 ); 
this is one of the most important advantages of the system dynamics 
approach. 
 Conclusions to draw: The observed behavior is a direct conse-
quence of the underlying processes considered in the model. The 
observed behavior resulting from simulated interventions can 
constitute  predictions ( see Table  1 ). For example, in GRN dynami-
cal models the expression profi le represents or correlates with par-
ticular cellular phenotypes ( see Table  2 ). The modeled regulatory 
interactions restrict the permissible behavior of the time-changing 
expression profi le, and also determine the existence of certain sta-
ble, time-invariant expression profi les. Multiple studies have 
shown that these stable confi gurations correspond to those char-
acterized in several cell types but for which a mechanistic and 
dynamical explanation was lacking [ 2 ,  4 ,  30 ]. Therefore, stable 
cellular phenotypes, as described by gene expression profi les, 
result from the restrictions imposed by a given GRN. Furthermore, 
loss- or gain-of- function mutations can be easily simulated as con-
trolled interventions in the model. The effect of these simulated 
interventions on the observed stable expression profi les can be 
useful to validate the model derived from the considered wild-
type (wt) constraints, or can also constitute predictions subjected 
to experimental  validation ( see Table  1 ).
 A dynamic model is built up from descriptive equations represent-
ing the processes thought to account for the patterns observed in 
the given data, whereas a descriptive model only represents the 
patterns themselves. Do these two strategies have to be mutually 
1.1.3  Descriptive 
vs. Mechanistic
J. Davila-Velderrain et al.
461
exclusive? We consider that the integration of descriptive and 
mechanistic models is a promising, yet rarely applied, approach in 
post-genomic biology. Incomplete knowledge is a common limita-
tion for the postulation of mechanistic GRN models. On the other 
hand, the main goal of descriptive models is to uncover new knowl-
edge from high-throughput data, which is quite vast and increas-
ing in post-genomic biology. In our opinion, this distinction can 
be exploited in order to circumvent the limitations of each indi-
vidual approach. The  predictions ( see Table  1 for defi nitions) made 
by following a descriptive approach can be used as a source of 
knowledge to be integrated into a mechanistic model. In order for 
this suggested model integration strategy to be useful, however, 
the descriptive predictions should be accurate. How do we test if 
this is the case? We approach this issue in  Subheading  3.3 below. 
 In the following sections, we show how to apply both a descrip-
tive and a mechanistic modeling approach taking a well-defi ned 
regulatory module as a simple illustrative example. 
2  Materials 
 Arabidopsis thaliana Root Genome-wide GRN: In a recent study, 
Montes and collaborators applied network inference to publicly avail-
able  Arabidopsis thaliana root microarray samples [ 45 ]. They com-
piled a dataset of microarray samples from the EBI ArrayExpress 
database based on the following criteria: (1) include only experiments 
2.1  Descriptive 
Approach to GRN 
Modeling
2.1.1  Data
 Table 2 
 GRN dynamical model concepts 
 Concept  Defi nition 
 Node  Representation of a molecular species (gene, protein, etc.). 
 Edge  Representation of a given regulatory interaction. 
 Node state (variable)  Expression value that a node takes at a certain time. 
 Network state  Ordered set of node expression values at a certain time. 
 State space  Set comprising all possible network states. 
 Attractor  Stable and stationary (time-invariant) network states. 
 Transitory state  Network states that are not (do not form part) of an 
attractor (attractor’s basin). 
 Basin of attraction  Set comprising all the initial network states that eventually 
lead to a particular attractor. 
 Biologically observable attractor  Gene expression profi les (gene confi gurations) that have 
been obtained from experimental assays and reported in 
the scientifi c literature for particular cell types. 
Descriptive vs. Mechanistic Network Models in Plant Development…
462
using the Affymetrix GeneChip ATH1-121501, (2) include only data 
corresponding to root tissues, (3) exclude samples from ecotypes 
other than Columbia-0, and (4) exclude transgenic samples (mutant 
and overexpression lines, and promoter constructs). The fi nal dataset 
consists of 656 microarray samples. The raw microarray data was pre-
processed using the R package  gcrm to obtain the expression matrix 
(for details,  see ref.  45 ). For illustration purposes, here we use this 
dataset for the inference exercises. All the inferences shown below are 
based on the data extracted from this microarray expression matrix. 
 Arabidopsis thaliana Root Stem Cell Niche (SCN) GRN: In an 
attempt to explain the robust patterning of the root SCN of 
 Arabidopsis thaliana in terms of the dynamics of known molecular 
regulators, Azpeitia and collaborators recently postulated several 
GRN dynamical models [ 32 ]. The models are grounded on experi-
mental evidence of the interactions among the main molecular regu-
lators of root SCN patterning. We take this prior experimental 
information as the basis for the models developed in this chapter. In 
order to have a direct comparison between the inferred (descriptive) 
and the dynamical (mechanistic) GRN models, we extract from the 
dataset of Montes and collaborators [ 45 ] only the expression data 
corresponding to the set of molecular regulators considered by 
Azpeitia and collaborators [ 32 ]. In Table  3 , we show a summary of 
the supporting experimental evidence. We consider these character-
ized interactions as the “real” interactions set, against which all the 
inferences would be tested. Accordingly, from the complete expres-
sion matrix ( see Subheading  2.1.1 ) we extracted only the rows cor-
responding to the set of genes involved in the “real” interactions set. 
All the inferences are based on this smaller expression matrix.
 Correlation calculations: R statistical programming environment 
( www.R-project.org ). 
 Mutual information based inference:  minet , R package [ 47 ]. 
 Network visualization: R package  Rgraphviz [ 48 ]. 
 We take the experimental data in Table  3 as the basis to defi ne the list 
of state variables (genes) and the corresponding set of Boolean rules. 
 Experimental expression profi les (expected attractors) are 
extracted from ref.  32 . 
 Mutant phenotypes are extracted from ref.  32 . 
 BoolNet , R package [ 60 ]. 
 PPC-based co-expression network (Fig.  1 ). 
 MI-based co-expression networks (Fig.  2 ). 
 “Real” network (Fig.  3 ). 
 minet, R package [ 47 ]. 
2.1.2  Software
2.2  Mechanistic 
Approach to GRN 
Modeling
2.2.1  Data
2.2.2  Software
2.3  Inference 
Performance
2.3.1  Data
2.3.2  Software
J. Davila-Velderrain et al.
463
3  Methods 
 The practice of inference within systems biology is commonly asso-
ciated with terms such as reverse engineering [ 37 ,  21 ], data-driven 
modeling [ 40 ], or network learning [ 41 ]. Here we refer to all 
these practices as  descriptive modeling, as they rely on fi nding 
3.1  Descriptive 
Approach to GRN 
Modeling
 Table 3 
 Experimentally supported (real) interactions set 
 Interactions  Experimental evidence 
 SHR → SCR  The expression of  SCR is reduced in  shr mutants. ChIP-QRTPCR experiments 
show that SHR directly binds in vivo to the regulatory sequences of  SCR and 
positively regulates its transcription. 
 SCR → SCR  In the  scr mutant background, promoter activity of  SCR is absent in the QC and 
CEI. A ChIP-PCR assay confi rmed that SCR directly binds to its own 
promoter and directs its own expression. 
 JKD → SCR  SCR mRNA expression as probed with a reporter lines is lost in the QC and CEI 
cells in  jkd mutants from the early heart stage onward. 
 MGP–|SCR  The double mutant  jkd mgp rescues the expression of  SCR in the QC and CEI, 
which is lost in the  jkd single mutant. 
 SHR → MGP  The expression of  MGP is severely reduced in the  shr background. Experimental 
data using various approaches have suggested that  MGP is a direct target of 
SHR. This result was later confi rmed by ChIP-PCR. 
 SCR → MGP  SCR directly binds to the  MGP promoter, and  MGP expression is reduced in the 
 scr mutant background. 
 SHR → JKD  The post-embryonic expression of  JKD is reduced in  shr mutant roots. 
 SCR → JKD  The post-embryonic expression of  JKD is reduced in  scr mutant roots. 
 SCR → WOX5  WOX5 is not expressed in  scr mutants. 
 SHR → WOX5  WOX5 expression is reduced in  shr mutants. 
 ARF(MP) → WOX5  WOX5 expression is rarely detected in  mp or  bdl mutants. 
 ARF → PLT  PLT1 mRNA region of expression is reduced in multiple mutants of  PIN genes, 
and it is overexpressed under ectopic auxin addition.  PLT1 and  2 mRNAs are 
absent in the majority of  mp embryos and even more so in  mp nph4 double 
mutant embryos. 
 Aux/IAA–|ARF  Overexpression of  Aux/IAA genes represses the expression of  DR5 both in the 
presence and absence of auxin. Domains III and IV of Aux/IAA proteins 
interact with domains III and IV of ARF stabilizing the dimerization that 
represses ARF transcriptional activity. 
 Auxin–|Aux/IAA  Auxin application destabilizes Aux/IAA proteins. 
 Aux/IAA proteins are targets of ubiquitin-mediated auxin-dependent 
degradation. 
 CLE40–|WOX5  Wild-type root treated with CLE40p show a reduction of  WOX5 expression, 
whereas in  cle40 loss-of-function plants  WOX5 is overexpressed. 
Descriptive vs. Mechanistic Network Models in Plant Development…
464
 statistical patterns in the genomic data either at the DNA, mRNA, 
protein, or metabolic level. Importantly, we do not include here 
the problem of inferring parameters of mechanistic models from 
data [ 42 ], a practice that may be diffi cult to classify under the 
scheme we chose. Multiple statistical models are currently used for 
network inference purposes [ 20 ]. Here we focus exclusively on 
those models that have been most widely used in plant genomics 
and systems biology, namely, co-expression networks based on 
either (1) pair-wise correlation [ 43 ,  44 ], or (2) mutual information 
criteria [ 45 ,  46 ]. Inference of GRNs by estimating statistical pat-
terns of co-expression is a widely used practice [ 2 ,  20 ]. 
 Comparing expression patterns between genes is the basis for con-
structing a co-expression network [ 49 ]. A straightforward defi ni-
tion of a gene co-expression network is a network in which an edge 
between a given node, say A, and a related node, say B, is added if 
some measure of similarity between the expression profi les of gene 
A and gene B exceeds some threshold value, although more strin-
gent algorithms exist (see below). One of the most simple and 
widely used measures of similarity for network construction is the 
3.1.1  Pairwise 
Correlation Co-expression 
Network
 Fig. 1  PPC-based inferred GRN. The  graph shows the inferred gene interactions 
among the molecular players included in Table  3 . Only those interactions involv-
ing a PPC value equal or greater than 0.3 were included in the network ( see  Note 
6 ). The inferred GRN qualitatively resembles the real, experimentally supported 
GRN ( see Fig.  4 ) 
 
J. Davila-Velderrain et al.
465
Pearson correlation coeffi cient (PCC) [ 50 ]. This quite useful 
approach has been applied several times in plant genomic studies 
using different expression datasets, and mostly for the analysis of 
genome-scale networks ( see , for example refs.  43 ,  51 ,  52 ). 
 A generic protocol to construct a PPC-based co-expression 
network for the genes involved in the experimental data summa-
rized in Table  3 would be as follows:
  1.  A matrix with numbers representing gene expression values is 
required. In this matrix rows correspond to the genes of inter-
est to be integrated in the network. Columns correspond to 
the samples where gene expression was measured. We refer to 
such a matrix as the  expression matrix . Here, for illustration, we 
use a data matrix extracted from [ 45 ] which corresponds to 
expression data of the genes summarized in Table  3 (we 
excluded  WOX5 , as it does not have a unique Affymetrix 
microarray identifi er). 
 Fig. 2  MI-based inferred GRNs. Graphs of the MI-based inferred GRNs corre-
sponding to each of the algorithms were implemented in the package  minet. The 
inferred GRNs are in general more connected than the one based on PPC infer-
ence.  CLE40 is a molecular player that was hypothesized to be interacting with 
 WOX5 (not included because of lack of expression data).  WOX5 in turn interacts 
with  SCR, SHR , and  ARF ( see Azpeitia et al. [ 32 ]). Interestingly, the  mrnet al go-
rithm, which has been shown to perform better than other MI-based algorithms, 
uncovered co-expression interactions between  CLE40 and the interacting part-
ners of  WOX5 
 
Descriptive vs. Mechanistic Network Models in Plant Development…
mrnet H aracne elr 
466
  2.  Given the expression matrix, Pearson correlation coeffi cient 
(PCC) values are calculated between pairs of rows (i.e., 
 expression profi les). The function  cor implemented in the R 
statistical programming environment can be used for this pur-
pose. Specifi cally, the expression matrix is given as input to the 
 cor function and it automatically calculates PCC values between 
all possible pairs of rows retrieving a  correlation matrix , i.e., a 
matrix whose element  i,j represents the PCC value between 
genes  i and  j . 
  3.  Given the correlation matrix, an edge is defi ned between the 
genes  i and  j if the PCC value between them is greater or equal 
to user-specifi ed threshold value. The complete co-expression 
network results from defi ning all gene pairs fulfi lling the 
requirement ( see  Note 6 ). 
  4.  The co-expression network can be plotted using the R package 
 Rgraphviz using as input a list of the edges defi ned to be 
included in the network. 
 A very popular inferential approach is based on applying 
 well- established tools from standard information theory [ 2 ,  21 , 
 53 ,  54 ]. Interactions in these types of inferred co-expression 
3.1.2  Mutual Information 
Network Inference
 Fig. 3  Obtained attractors of the root SCN GRN. The GRN recovered four fi xed-point attractors corresponding 
to the Root SCN patterning cell types: quiescent center (QC), vascular initials, Cortex–Endodermis initials (CEI), 
and columella–epidermis–lateral root cap initials (CepI). In the graph,  green color indicates expression or gene 
activation (1), while  red color indicates no expression or inactivation (0) 
 
J. Davila-Velderrain et al.
Altr.Cloro wllh 1 0"'.(0) 
WOX5 
MGP 
JKD 
SHA 
Auxin 
AAF 
PlT 
seA 
Cepl Vascular CEI QC 
1·- ·-1 
467
 networks represent a high-degree of statistical dependence between 
gene expression profi les. These dependencies are typically mea-
sured by mutual information (MI) [ 47 ]. The adoption of mutual 
information in network inference is said to circumvent some of the 
limitations of PPC-based approaches ( see  Note 7 ). Recent studies 
have shown the utility of MI-based co-expression network infer-
ences for uncovering biological knowledge from plant transcrip-
tomes [ 45 ,  46 ]. Several tools are available for direct implementation 
of MI-based inferences [ 47 ,  55 ,  56 ]. 
 Given gene expression data in the form of a gene expression 
matrix ( see  Subheading  2.1.1 ), the inference of a MI-based co- 
expression network consists of two main steps, (1) MI computa-
tion and (2) network inference. Thus, a generic protocol infers 
interactions among Root SCN regulators using the R package 
 minet [ 47 ], as follows:
 1.  MI computation: pairwise MI calculations are performed in 
order to obtain a mutual information matrix (MIM). The func-
tion  build.mim from the  minet package can be used for this 
purpose. 
  2.  Network inference: based on the calculated MIM, one of 
 several algorithms is used to select which interactions are 
included (excluded) to produce a fi nal network. The simplest 
approach is to choose a threshold MI value, as it was done with 
the PPC- based network above. However, the  minet package 
implements three different algorithms that go beyond the 
threshold approach in an attempt to reduce the likelihood of 
inferring indirect interactions, i.e., situations where, for exam-
ple, a MI value between A and B is high because a third gene 
C is regulating both A and B ( see ref.  54 for details). The three 
algorithms are CLR, ARACNE, and MRNET, and these can 
be implemented by the respective functions  clr ,  aracne , and 
 mrnet using the previously calculated MIM as input. 
  3.  Steps 1 and  2 can be applied sequentially using the main func-
tion  minet() . This function implements sequentially all the 
steps required for the inference, starting directly from the 
expression matrix and taking the user-selected algorithms as 
arguments. 
 We applied the protocols described above to obtain one PPC- 
based (Fig.  1 ) and three MI-based co-expression networks (Fig.  2 ). 
Importantly, in co-expression networks auto-regulatory interactions 
are not considered, nor is the directionality of each interaction. 
 Dynamic models are diverse, among other things, in terms of the 
mathematical setting of the model (continuous or discrete time 
and model variables, deterministic or stochastic, etc.). For simplic-
ity, here we focus on discrete time and discrete state, deterministic 
3.2  Mechanistic 
Approach to GRN 
Modeling
Descriptive vs. Mechanistic Network Models in Plant Development…
468
dynamic models. The most widely used GRN model of this type is 
the Boolean network model [ 29 ,  57 ,  58 ]. The extension of that 
dynamic model into more complex models, as well as a more 
detailed exposition of their analyses, has been reviewed recently by 
the authors ( see refs.  7 ,  59 ). A dynamic GRN Boolean model has 
two essential components:
  1.  A short list of state variables (genes) that are taken to be suffi -
cient for summarizing the properties of interest in the develop-
mental system, and predicting how those properties will change 
over time. In a Boolean GRN the variables can only attain one 
of two possible values:  1 if the node is  ON , and  0 if the node is 
 OFF . A  0 node value represents that a gene is not being 
expressed, while a  1 node value represents that a gene is 
expressed. These are combined into a  state vector (in simple 
terms: a vector is an ordered list of numbers) ( see Table  2 for 
defi nitions). 
  2.  The dynamic equations: a set of equations (or rules) specifying 
how the state variables change over time, as a function of the 
current and past values of the state variables (we say that the 
concerned system is causal and not memory less). In a Boolean 
model these rules are specifi ed in terms of logical propositions 
or truth tables (see below). 
 Thus, a generic protocol to postulate a GRN model for a par-
ticular developmental module would be as follows:
  1.  Defi ne the list of state variables (genes): based on available 
experimental data, select the set of potential nodes or molecu-
lar components that will be incorporated in the GRN model. 
  2.  Defi ne the dynamic equations: collect statements on well- 
established gene dependencies from literature and express 
them as Boolean rules or truth tables. 
  3.  Defi ne the “expected attractors”: integrate in a Boolean vector 
the observed expression profi les of the cell-types of interest 
corresponding to the developmental system being modeled. 
For this, experimental data concerning the spatiotemporal 
expression patterns of the genes to be incorporated in the 
model can be used. 
  4.  Perform a dynamic analysis of the defi ned GRN model defi ned in 
 steps 1 and  2 using a computer-based simulation tool. Identify 
the stable gene confi gurations (“simulated attractors”). 
  5.  Compare the simulated attractors to the ones observed experi-
mentally (expected attractors;  see  step 3 above) ( see  Note 8 ). 
  6.  Validate the model by addressing if it recovers the wild-type 
and mutant (loss- and gain-of-function) gene activation con-
fi gurations that characterize the cells being considered. 
J. Davila-Velderrain et al.
469
 In the following section, we show a practical implementation 
of this general protocol using the Arabidopsis root SCN GRN as a 
simple illustrative example. 
 Defi ne the list of state variables (genes): Through an exhaustive 
review of literature, Azpeitia and collaborators identifi ed the set of 
molecules included in Table  3 as potential members of a develop-
mental module [ 32 ]. This set is taken as the list of state variables 
for the GRN Boolean model ( see Table  4 ).
 Defi ne Boolean rules: A major advantage of Boolean networks is the 
fact that natural-language statements can easily be transferred into 
Boolean representation. The discrete-time Boolean formalism is 
useful to postulate the set of components and interactions that are 
necessary and suffi cient to recover a particular observed multivari-
able state (for example, a gene expression confi guration). The same 
logic can be used as well to integrate both molecular genetic and 
non-genetic components, for example: the effect of mechanical 
forces, geometric constraints, or chemical components [ 8 ,  9 ]. 
Here we illustrate this process taking as an example the experimen-
tal evidence regarding the functional relationships between the 
genes SCR and SHR ( see Table  3 ). 
 Natural-language statement 1: 
 “ The expression of SCR is reduced in shr mutants. ChIP- QRTPCR 
experiments show that SHR directly binds in vivo to the regulatory 
sequences of SCR and positively regulates its transcription. ” 
3.2.1  Mechanistic 
Modeling of Arabidopsis 
Root SCN GRN
 Table 4 
 Boolean GRN model 
 List of state variables 
 X = [SCR, PLT, ARF, Aux, Auxin, SHR, JKD, MGP, WOX5] 
 Boolean functions 
 SCR = SHR & SCR 
 PLT = ARF 
 ARF = !Aux 
 Aux = !Auxin 
 Auxin = !Auxin|Auxin 
 SHR = SHR 
 JKD = SHR & SCR 
 MGP = SHR & SCR & !WOX5 
 WOX5 = ARF & SHR & SCR & !(MGP & !WOX5) 
Descriptive vs. Mechanistic Network Models in Plant Development…
470
 Transforming this into a Boolean rule is rather simple: 
 SCR value after transition depends on SHR, and its value is 
reduced if SHR is reduced. 
 Thus, the corresponding transition rule is
 SCR SHR=   
Natural-language statement 2: 
 “ In the scr mutant background promoter activity of SCR is 
absent in the Root SCN patterning cell types quiescent center (QC) 
and Cortex-Endodermis initials (CEI). A ChIP-PCR assay con-
fi rmed that SCR directly binds to its own promoter and directs its 
own expression. ” 
 SCR value after transition depends also on itself, and its pro-
moter activity is reduced if SCR is reduced. 
 Thus, the transition rule is
 SCR SCR=   
In both cases, the regulatory infl uence is positive. Taken both rules 
together we obtain the rule:
 SCR SHR SCR= &   
where & represents the AND operator. The rule means that SCR 
will be expressed in the future time step if both SHR and SCR are 
expressed in the present time step. 
 Following this intuitive transformation process from natural- 
language statements into Boolean rules or truth tables, one rule for 
each gene can be postulated. The set of genes with their corre-
sponding Boolean rules completely specifi es the Boolean GRN 
( see Table  4 ). 
 Defi ne the “expected attractors”: Azpeitia and collaborators defi ned 
four cell-type expression profi les based on spatiotemporal experi-
mental data from literature sources ( see Table  5 ). These profi les are 
taken as the set of “expected attractors”, which the model is expected 
to recover dynamically as a result of the restrictions imposed by the 
regulatory interactions encoded in the Boolean rules. Hence such 
modeling approach enables a mechanistic and dynamical explana-
tion for the observed gene expression confi gurations.
 Analyze GRN model dynamics: Once the set of Boolean rules is 
specifi ed, these can be loaded directly into the  BoolNet R package 
( see  Note 9 ). This software is able to read in networks consisting of 
such rule sets, as specifi ed in Table  5 , in a standardized text fi le 
format ( see ref.  60 ). Attractors are stable cycles of states in a Boolean 
network. As they comprise the states in which the network resides 
most of the time, attractors in models of GRNs developmental 
modules are expected to correspond to cellular phenotypes (cell- 
J. Davila-Velderrain et al.
471
type specifi c expression profi les). The  BoolNet package is able to 
identify attractors through the function  getAttractors() . This func-
tion incorporates several methods for the identifi cation of attrac-
tors, using as default an exhaustive synchronous search strategy. 
The identifi ed attractors can then be plotted using the function 
 plotAttractors(). We applied these functions to the Root SCN 
GRN and identifi ed four attractors ( see Fig.  3 ). 
 Comparison of simulated and observed/expected attractors: As 
expected, the simulated attractors uncovered by the GRN model 
dynamics ( see Fig.  3 ) correspond with the “expected attractors” 
defi ned by experimental data ( see Table  5 ). This suggests that cell- 
type specifi cation patterns in the root SCN result from the restric-
tions imposed by the uncovered GRN developmental module. 
Defi ning the expected set of attractors is an indispensable step 
when building the GRN model, because they are used to validate 
the GRN. However, it should be clear that the postulation of the 
Boolean functions is an independent task and, hence, it does not 
imply circularity. 
 Simulations of mutant gene knockout and overexpression confi gura-
tions: For validation purposes, it is straightforward to implement 
knockout and overexpression simulation experiments within the 
 BoolNet package. Specifi cally, genes can be set to a fi xed value (0 
for knockout, and 1 for overexpression), and in any calculation on 
the network this fi xed value is taken instead of the value of the cor-
responding transition function. The function  fi xGenes() takes as 
input the network, the name of the gene to be perturbed, and the 
value to be fi xed (0 or 1). Then all the other analysis, such as attrac-
tors’ identifi cation, can be performed over this new perturbed net-
work. Azpeitia and collaborators followed this approach and 
showed that most predicted alterations to the stable confi gurations 
caused by mutant simulations where consistent with known empir-
ical observations [ 32 ]. This validates the uncovered dynamical 
module or set of restrictions as necessary and suffi cient to explain 
the observed gene expression confi gurations. 
 Table 5 
 Gene expression profi les (expected attractors) 
 Cell type  PLT  Auxin  ARF  Aux/IAA  SHR  SCR  JKD  MGP  WOX5 
 QC  1  1  1  0  1  1  1  0  1 
 Vascular initials  1  1  1  0  1  0  0  0  0 
 CEI  1  1  1  0  1  1  1  1  0 
 Cepl  1  1  1  0  0  0  0  0  0 
Descriptive vs. Mechanistic Network Models in Plant Development…
472
 In the previous sections, we fi rst applied a descriptive approach to 
GRN modeling in order to infer GRN interactions from gene 
expression data. As a result, we constructed four inferred GRNs 
(Figs.  1 and  2 ). We then described the assemblage and analysis of 
an experimentally grounded GRN mechanistic model. In this sec-
tion, we show how to assess the different network inference algo-
rithms. We are interested in knowing if the inferred interactions are 
consistent with the ones defi ned based on published molecular 
functional experimental data. Once a “true” network is defi ned, 
there exist well-established tools to assess the performance of the 
inference algorithms. In this section, we take as a “true” network 
the one based on well-curated functional molecular genetic data 
and call it the mechanistic SCN GRN model that integrates the 
interactions summarized in Table  3 . The model is shown in Fig.  4 . 
In this section, we show how to assess the algorithms implemented 
in the descriptive modeling section using a common graphical tool: 
the ROC curve ( see  Note 10 ). 
 An interaction predicted by the algorithm is considered as a true 
positive (TP) or as a false positive (FP) depending on the presence 
or not of the corresponding interaction in the underlying “true” 
network, respectively. Analogously, the prediction of the absence 
of an interaction is considered as a true negative (TN) or a false 
negative (FN) depending on whether the corresponding edge is 
present or not in the underlying true network, respectively. Since 
GRN inference algorithms use a threshold value in order to defi ne 
3.3  Inference 
Performance
3.3.1  ROC Curves
 Fig. 4  “Real” root SCN GRN. The graph shows one of the single-cell Root SCN 
GRNs proposed in Azpeitia et al. [ 32 ]. The GRN is based on the experimental 
evidence summarized in Table  3 , and it represents graphically the information 
encoded in the logical statements shown in Table  4 
 
J. Davila-Velderrain et al.
473
which edges are not included in the fi nal network, the previous 
values (TP, FP, TN, and FN) can be calculated for each threshold 
value. 
 Using these defi nitions, two performance metrics can be calcu-
lated: the false positive rate, defi ned as FPR = FP/(TN + FP), and 
the true positive rate (sensitivity), TPR = TP/(TP + FN). The ROC 
curve is a commonly used graphical analysis in which the TPR 
(true positive rate) vs. FPR (false positive rate) are plotted for an 
inference algorithm as the threshold value is varied. A perfect infer-
ence algorithm would yield a point in the upper left corner of the 
ROC space, representing 100 % TPR (all true positives are found) 
and 0 % FPR (no false positives are found). Accordingly, points 
above the diagonal line indicate good inference results, while 
points below the line indicate wrong results. 
 A generic protocol to measure GRN Inference performance by 
means of a ROC curve analysis would be as follows:
  1.  Represent the inferred  n genes network as an  n ×  n adjacency 
matrix, where the cell  i,j contains the value of similarity metric 
(PPC or MI) between the expression profi les of the genes  i and  j: 
both  cor and  minet functions return such a matrix 
( see  Subheading  3.1 ). 
  2.  Defi ne an adjacency matrix for the “real” interactions, where 
the cell  i,j contains 1(0) indicating the presence (absence) of 
experimentally supported interaction. 
  3.  Use the function  validate() , which takes as arguments the 
inferred and the real networks (in matrix form) and calculates 
the metrics TP, FP, TN, and FN ( see Subheading  3.3.1 ) for dif-
ferent threshold values. 
  4.  Measure the accuracy of each algorithm by calculating the area 
under the ROC curve using the function  auc.roc of the pack-
age  minet . 
 We applied the previous protocol to compare each of the inferred 
networks with the “real” experimentally supported network. 
Figure  5 shows the ROC curves for the four comparisons. The 
methods PPC and MRNET show a better performance, given that 
their curves (points) are closer to the top-left corner (perfect infer-
ence) than those of other methods. Table  6  shows the calculated 
AUC values. Interestingly, the simple PPC-based inference showed 
the highest accuracy, while the method ARACNE showed the low-
est ( see  Note 11 ). Overall, the inference method shows a good per-
formance (AUC > 8.3), with the exception of ARACNE. This 
suggests that inferred interactions from curated expression data set 
as the one assembled in [ 45 ] provide important background infor-
mation consistent with experimentally supported functional rela-
tionships, at least for the module analyzed here. 
Descriptive vs. Mechanistic Network Models in Plant Development…
474
4  Notes 
  1.  Correlation does not imply causation [ 61 ]. If two variables, A 
and B, are correlated with high statistical signifi cance, it does 
not necessarily imply that A causes B (nor that B causes A). 
  2.  Dataset selection is an important part in inference approaches. 
Finding or not interactions among variables directly depends 
on the statistical properties of the data. Depending on the 
goals of the study, one could choose to integrate a comprehen-
sive large and heterogeneous dataset [ 46 ], or a smaller one 
based on certain selection criteria [ 45 ]. The results will likely 
vary depending on the dataset, even when using the exact same 
inference algorithm. The same is true for the performance of 
the different algorithms (see below). 
  3.  Importantly, an edge in an inferred co-expression network 
does not imply a physical interaction or a direct regulatory 
infl uence. It is assumed that genes that are co-expressed across 
conditions are likely to share a common function, or to be 
 Fig. 5  ROC Curves for inference algorithms. The  graph shows a comparison of 
the performance of each of the inference algorithms used herein. For each of the 
four algorithms, a ROC curve is plotted. Most of the points appear above the 
 diagonal line indicating a general good inference performance. The curves that 
reach a higher TP rate while having low or null FP rate outperform the other. In 
this case:  clr ,  mrnet , and  PPC outperform  aracne 
 Table 6 
 Area under the (ROC) curve (AUC) values 
 PPC  CLR  ARACNE  MRNET 
 AUC  0.8355856  0.8333333  0.6869369  0.8310811 
 
J. Davila-Velderrain et al.
475
involved in similar biological processes [ 62 ]. This  functional 
relationship does not imply a direct functional dependence 
between the corresponding molecules. 
  4.  At fi rst sight, from a mechanistic point of view, the entire 
notion of validating or invalidating models may seem mis-
guided [ 17 ]. Models are valuable in science not because they 
can be validated, but because they can be useful for improving 
our understanding of a given observed phenomenon. Models 
may be found inconsistent with a set of data, but that does not 
necessarily rob them of their utility. The consequences of a 
specifi c set of assumptions included as underlying processes in 
a mechanistic model do not depend on the available experi-
mental data, nor on a validation process. Thus, a mechanistic 
model is always a well-suited tool to address questions regard-
ing such assumptions [ 63 ]. 
  5.  In the case of incomplete or uncertain prior knowledge about 
the system being modeled, a single model may be less useful 
than a set of models representing different hypotheses. Instead 
of having to decide if a specifi c model fi ts the data, which is 
hard and subjective, one can test which model fi ts the data 
best, which is easier and more objective [ 22 ]. In this way, puta-
tive interaction of functional relationships between genes can 
be postulated as hypotheses in the form of different GRN 
models. Each model can be tested against the observations 
(e.g., expected expression profi les) and in this way address 
which set of hypotheses fi ts better. 
  6.  A link is established by an edge between two genes, repre-
sented by nodes, if the PCC value is higher or equal to an 
arbitrary cutoff that can be adjusted depending on the data-
set used. In the present case, we chose the greatest value pro-
ducing a fully connected network (a network where all the 
nodes have at least one edge). The chosen value was 0.3. In 
this case, such a small value is associated with the fact of hav-
ing a fairly homogeneous dataset: only samples from a single 
tissue (root) and under wild-type conditions. Even in this 
case, the PCC- based inference showed good performance 
( see  Subheading  3.3 ). 
  7.  Unlike PPC, MI is not restricted to the identifi cation of linear 
relations between the random variables, and is used as an 
approach to eliminate the majority of indirect interactions 
inferred by co-expression methods [ 47 ,  55 ]. 
  8.  A perfect coincidence would suggest that a suffi cient set of 
molecular components (nodes) and a fairly correct set of inter-
actions have been considered in the postulated GRN model. If 
this is not the case, additional components and interactions can 
be incorporated or postulated, or the Boolean functions can be 
Descriptive vs. Mechanistic Network Models in Plant Development…
476
  1.  Forgacs G, Newman SA (2005) Biological 
physics of the developing embryo. Cambridge 
University Press, Cambridge 
  2.  Alvarez-Buylla ER, Benítez M, Dávila EB et al 
(2007) Gene regulatory network models for plant 
development. Curr Opin Plant Biol 10(1):83–91 
  3.  Huang S, Kauffman S (2009) Complex gene 
regulatory networks—from structure to bio-
logical observables: cell fate determination. In: 
Meyers RA (ed) Encyclopedia of complexity 
and systems science. Springer, Heidelberg, 
pp 1180–1213 
  4.  Alvarez-Buylla ER, Azpeitia E, Barrio R, 
Benítez M, Padilla-Longoria P (2010) From 
ABC genes to regulatory networks, epigenetic 
landscapes and fl ower morphogenesis: making 
modifi ed. This allows to refi ne interpretations of experimental 
data or to postulate novel interactions to be tested experimen-
tally in the future. In any case, the process can be repeated 
several times based on the dynamical behavior of the modifi ed 
versions of the GRN under study until a regulatory module is 
postulated. Such module can include some novel hypothetical 
interactions or components, integrate available experimental 
data, and identify possible experimental contradictions or gaps. 
  9.  There are several free software packages for the dynamic analysis 
of Boolean GRNs, including: ANTELOPE  [ 64 ], GINSIM [ 65 ], 
BoolNet [ 60 ], GNbox [ 66 ], GNA [ 67 ], and BioCham [ 68 ]. 
  10.  The performance of the inference algorithms heavily relies on 
the dataset used. There is no best algorithm for all cases. We 
showed that the simplest, most criticized algorithm (PPC- 
based inference) showed the best performance in the case ana-
lyzed here. 
  11.  There are other tools to test the performance of inference 
algorithms. ROC curves can present an overly optimistic view 
of an algorithm’s performance if there is a large skew in the 
types of interactions present in the true network (true and false 
interactions). This situation is common in GRN network infer-
ence because of sparseness. To tackle this problem, precision–
recall (PR) curves can be used ( see ref.  47 ). 
 Acknowledgments 
 J.D.V acknowledges the support of CONACYT and the Centre for 
Genomic Regulation (CRG), Barcelona, Spain; while spending a 
research visit in the lab of Stephan Ossowski. This chapter consti-
tutes a partial fulfi llment of the graduate program Doctorado en 
Ciencias Biomédicas of the Universidad Nacional Autónoma de 
México, UNAM in which J.D.V. developed this project. This work 
was supported by grants CONACYT 180098, 180380, 167705, 
152649 and UNAM-DGAPA-PAPIIT: IN203113, IN 203214, 
IN203814, UC Mexus ECO-IE415. The authors acknowledge 
logistical and administrative help of Diana Romo. 
 References 
J. Davila-Velderrain et al.
477
biological sense of theoretical approaches. 
Semin Cell Dev Biol 21(1):108–117 
  5.  Kaneko K (2006) Life: an introduction to com-
plex systems biology. Springer, New York 
  6.  Azpeitia E, Alvarez-Buylla ER (2012) A com-
plex systems approach to Arabidopsis root 
stem-cell niche developmental mechanisms: 
from molecules, to networks, to morphogene-
sis. Plant Mol Biol 80(4–5):351–363 
  7.  Azpeitia E, Davila-Velderrain J, Villarreal C 
et al (2014) Gene regulatory network models 
for fl oral organ determination. In: Riechmann 
JL, Wellmer F (eds) Flower development. 
Springer, New York, pp 441–469 
  8.  Barrio RÁ, Hernández-Machado A, Varea C, 
Romero-Arias JR, Alvarez-Buylla E (2010) 
Flower development as an interplay between 
dynamical physical fi elds and genetic networks. 
PLoS One 5(10):e13523 
  9.  Barrio RÁ, Romero-Arias JR, Noguez MA et al 
(2013) Cell patterns emerge from coupled 
chemical and physical fi elds with cell prolifera-
tion dynamics: the  Arabidopsis thaliana root as 
a study system. PLoS Comput Biol 
9(5):e1003026 
 10.  Proost S, Van Bel M, Sterck L et al (2009) 
PLAZA: a comparative genomics resource to 
study gene and genome evolution in plants. 
Plant Cell 21(12):3718–3731 
 11.  Weigel D, Mott R (2009) The 1001 genomes 
project for  Arabidopsis thaliana . Genome Biol 
10(5):107 
 12.  Hawkins RD, Hon GC, Ren B (2010) Next- 
generation genomics: an integrative approach. 
Nat Rev Genet 11(7):476–486 
 13.  Lamesch P, Berardini TZ, Li D et al (2012) The 
Arabidopsis Information Resource (TAIR): 
improved gene annotation and new tools. 
Nucleic Acids Res 40(D1):D1202–D1210 
 14.  Haughn GW, Somerville CR (1988) Genetic 
control of morphogenesis in Arabidopsis. Dev 
Genet 9(2):73–89 
 15.  Rowan BA, Weigel D, Koenig D (2011) 
Developmental genetics and new sequencing 
technologies: the rise of nonmodel organisms. 
Dev Cell 21(1):65–76 
 16.  Bowman JL, Smyth DR, Meyerowitz EM 
(2012) The ABC model of fl ower develop-
ment: then and now. Development 139(22):
4095–4098 
 17.  Lander AD (2010) The edges of understand-
ing. BMC Biol 8(1):40 
 18.  Yaffe MB (2013) The scientifi c drunk and the 
lamppost: massive sequencing efforts in cancer 
discovery and treatment. Sci Signal 6(269):pe13 
 19.  Lee WP, Tzou WS (2009) Computational meth-
ods for discovering gene networks from expres-
sion data. Brief Bioinform 10(4):408–423 
 20.  De Smet R, Marchal K (2010) Advantages and 
limitations of current network inference meth-
ods. Nat Rev Microbiol 8(10):717–729 
 21.  Villaverde AF, Banga JR (2014) Reverse engi-
neering and identifi cation in systems biology: 
strategies, perspectives and challenges. J R Soc 
Interface 11(91):20130505 
 22.  Ellner SP, Guckenheimer J (2011) Dynamic 
models in biology. Princeton University Press, 
Princeton, NJ 
 23.  Kell DB, Oliver SG (2004) Here is the evi-
dence, now what is the hypothesis? The com-
plementary roles of inductive and 
hypothesis‐driven science in the post‐genomic 
era. Bioessays 26(1):99–105 
 24.  Dehmer M, Emmert-Streib F, Graber A et al 
(2011) Applied statistics for network biology: 
methods in systems biology. Wiley, New York 
 25.  Hartwell LH, Hopfi eld JJ, Leibler S (1999) 
From molecular to modular cell biology. 
Nature 402:C47–C52 
 26.  Kashtan N, Alon U (2005) Spontaneous evolu-
tion of modularity and network motifs. Proc 
Natl Acad Sci U S A 102(39):13773–13778 
 27.  Espinosa-Soto C, Wagner A (2010) 
Specialization can drive the evolution of modu-
larity. PLoS Comput Biol 6(3):e1000719 
 28.  Mitra K, Carvunis AR, Ramesh SK et al (2013) 
Integrative approaches for fi nding modular 
structure in biological networks. Nat Rev 
Genet 14(10):719–732 
 29.  Mendoza L, Alvarez-Buylla ER (1998) 
Dynamics of the genetic regulatory network 
for  Arabidopsis thaliana fl ower morphogenesis. 
J Theor Biol 193(2):307–319. doi: 10.1006/
jtbi.1998.0701 
 30.  Espinosa-Soto C, Padilla-Longoria P, Alvarez- 
Buylla ER (2004) A gene regulatory network 
model for cell-fate determination during 
 Arabidopsis thaliana fl ower development that 
is robust and recovers experimental gene 
expression profi les. Plant Cell 16:2923–2939 
 31.  Albert R, Othmer HG (2003) The topology of 
the regulatory interactions predicts the expres-
sion pattern of the segment polarity genes in 
 Drosophila melanogaster . J Theor Biol 
223(1):1–18 
 32.  Azpeitia E, Benítez M, Vega I, Villarreal C, 
Alvarez-Buylla ER (2010) Single-cell and cou-
pled GRN models of cell patterning in the 
 Arabidopsis thaliana root stem cell niche. BMC 
Syst Biol 4:134 
Descriptive vs. Mechanistic Network Models in Plant Development…
478
 33.  Azpeitia E, Weinstein N, Benítez M et al 
(2013) Finding missing interactions of the 
 Arabidopsis thaliana root stem cell niche gene 
regulatory network. Front Plant Sci 4:110 
 34.  Caragea D, Welch SM, Hsu WH (2010) 
Handbook of research on computational meth-
odologies in gene regulatory networks. Medical 
Information Science Reference, Hershey, PA 
 35.  Wang R, Li C, Aihara K (2010) Modeling bio-
molecular networks in cells. Springer, New York 
 36.  Lingeman JM, Shasha D (2012) Network infer-
ence in molecular biology. Springer, New York 
 37.  Friedel S, Usadel B, Von Wirén N et al (2012) 
Reverse engineering: a key component of sys-
tems biology to unravel global abiotic stress 
cross-talk. Front Plant Sci 3:294 
 38.  Usadel B, Fernie AR (2013) The plant tran-
scriptome—from integrating observations to 
models. Front Plant Sci 4:48 
 39.  Jaeger J, Sharpe J (2014) On the concept of 
mechanism in development. In: Minelli A, 
Pradeu T (eds) Towards a theory of develop-
ment. Oxford University Press, Oxford, p 56 
 40.  Hua F, Hautaniemi S, Yokoo R et al (2006) 
Integrated mechanistic and data-driven model-
ling for multivariate analysis of signalling path-
ways. J R Soc Interface 3(9):515–526 
 41.  McGeachie MJ, Chang HH, Weiss ST (2014) 
CGBayesNets: conditional Gaussian Bayesian 
network learning and inference with mixed dis-
crete and continuous data. PLoS Comput Biol 
10(6):e1003676 
 42.  Crombach A, Wotton KR, Cicin-Sain D et al 
(2012) Effi cient reverse-engineering of a devel-
opmental gene regulatory network. PLoS 
Comput Biol 8(7):e1002589 
 43.  Mao L, Van Hemert JL, Dash S et al (2009) 
Arabidopsis gene co-expression network and its 
functional modules. BMC Bioinformatics 
10(1):346 
 44.  Feltus FA, Ficklin SP, Gibson SM et al (2013) 
Maximizing capture of gene co-expression rela-
tionships through pre-clustering of input 
expression samples: an Arabidopsis case study. 
BMC Syst Biol 7(1):44 
 45.  Montes RA, Coello G, González-Aguilera KL 
et al (2014) ARACNe-based inference, using 
curated microarray data, of  Arabidopsis thali-
ana root transcriptional regulatory networks. 
BMC Plant Biol 14(1):97 
 46.  Netotea S, Sundell D, Street NR et al (2014) 
ComPlEx: conservation and divergence of co- 
expression networks in  A. thaliana , Populus 
and  O. sativa . BMC Genomics 15(1):106 
 47.  Meyer PE, Lafi tte F, Bontempi G (2008) 
minet: AR/Bioconductor package for inferring 
large transcriptional networks using mutual 
information. BMC Bioinformatics 9(1):461 
 48.  Hansen KD, Gentry J, Long L et al (2009) 
Rgraphviz: provides plotting capabilities for R 
graph objects. R package version 2.8.1. 2009. 
 49.  Usadel B, Obayashi T, Mutwil M et al (2009) 
Co‐expression tools for plant biology: oppor-
tunities for hypothesis generation and caveats. 
Plant Cell Environ 32(12):1633–1651 
 50.  Cho DY, Kim YA, Przytycka TM (2012) 
Network biology approach to complex dis-
eases. PLoS Comput Biol 8(12):e1002820 
 51.  Cramer GR, Urano K, Delrot S et al (2011) 
Effects of abiotic stress on plants: a systems 
biology perspective. BMC Plant Biol 11(1):163 
 52.  Ficklin SP, Feltus FA (2011) Gene coexpres-
sion network alignment and conservation of 
gene modules between two grass species: maize 
and rice. Plant Physiol 156(3):1244–1256 
 53.  Hernández-Lemus E, Velázquez-Fernández D, 
Estrada-Gil JK et al (2009) Information theo-
retical methods to deconvolute genetic regula-
tory networks applied to thyroid neoplasms. 
Phys Stat Mech Appl 388(24):5057–5069 
 54.  Meyer PE, Olsen C, Bontempi G (2011) 
Transcriptional network inference based on 
information theory. In: Dehmer M, Emmert- 
Streib F et al (eds) Applied statistics for net-
work biology: methods in systems biology. 
Weinheim, Wiley-Blackwell, pp 67–89 
 55.  Margolin AA, Nemenman I, Basso K et al 
(2006) ARACNE: an algorithm for the 
reconstruction of gene regulatory networks 
in a mammalian cellular context. BMC 
Bioinformatics 7(Suppl 1):S7 
 56.  Sales G, Romualdi C (2011) parmigene—a 
parallel R package for mutual information esti-
mation and gene network reconstruction. 
Bioinformatics 27(13):1876–1877 
 57.  Kauffman S (1969) Homeostasis and differen-
tiation in random genetic control networks. 
Nature 224:177–178 
 58.  Albert I, Thakar J, Li S, Zhang R, Albert R 
(2008) Boolean network simulations for life 
scientists. Source Code Biol Med 3:16 
 59.  Davila-Velderrain J, Martinez-Garcia JC, 
Alvarez-Buylla ER (2014) Epigenetic land-
scape models: the post-genomic era.  bioRxiv 
 60.  Müssel C, Hopfensitz M, Kestler HA (2010) 
BoolNet - an R package for generation, 
 reconstruction and analysis of Boolean net-
works. Bioinformatics 26(10):1378–1380 
 61.  Huang S (2014) When correlation and causa-
tion coincide. Bioessays 36(1):1–2 
 62.  Lehner B, Lee I (2008) Network-guided 
genetic screening: building, testing and using 
J. Davila-Velderrain et al.
479
gene networks to predict gene function. Brief 
Funct Genomic Proteomic 7(3):217–227 
 63.  Gershenfeld N (1999) The nature of mathe-
matical modeling. Cambridge University Press, 
Cambridge 
 64.  Arellano G, Argil J, Azpeitia E et al (2011) 
“Antelope”: a hybrid-logic model checker for 
branching-time Boolean GRN analysis. BMC 
Bioinformatics 12:490 
 65.  Naldi A, Berenguier D, Fauré A et al (2009) 
Logical modeling of regulatory networks with 
ginsim 2.3. Biosystems 97(2):134–139 
 66.  Corblin F, Fanchon E, Trilling L (2010) 
Applications of a formal approach to decipher 
discrete genetic networks. BMC Bioinformatics 
11(1):385 
 67.  De Jong H, Geiselmann J, Hernandez C et al 
(2003) Genetic network analyzer: qualitative 
simulation of genetic regulatory networks. 
Bioinformatics 19(3):336–344 
 68.  Calzone L, Fages F, Soliman S (2006) Biocham: 
an environment for modeling biological sys-
tems and formalizing experimental knowledge. 
Bioinformatics 22(14):1805–1807 
Descriptive vs. Mechanistic Network Models in Plant Development…
REVIEW
published: 23 April 2015
doi: 10.3389/fgene.2015.00160
Frontiers in Genetics | www.frontiersin.org 1 April 2015 | Volume 6 | Article 160
Edited by:
Moisés Santillán,
Centro de Investigación y Estudios
Avanzados del IPN, Mexico
Reviewed by:
David McMillen,
University of Toronto Mississauga,
Canada
Enrique Hernandez-Lemus,
National Institute of Genomic
Medicine, Mexico
Edgardo Ugalde,
Universidad Autónoma de San Luis
Potosí, Mexico
*Correspondence:
Elena R. Alvarez-Buylla,
Instituto de Ecología, Universidad
Nacional Autónoma de México, 3er
Circuito Exterior, Junto a Jardín
Botánico, Mexico City, D.F. 04510,
México
eabuylla@gmail.com
Specialty section:
This article was submitted to
Systems Biology,
a section of the journal
Frontiers in Genetics
Received: 01 March 2015
Accepted: 08 April 2015
Published: 23 April 2015
Citation:
Davila-Velderrain J, Martinez-Garcia
JC and Alvarez-Buylla ER (2015)
Modeling the epigenetic attractors
landscape: toward a post-genomic
mechanistic understanding of
development. Front. Genet. 6:160.
doi: 10.3389/fgene.2015.00160
Modeling the epigenetic attractors
landscape: toward a post-genomic
mechanistic understanding of
development
Jose Davila-Velderrain 1, 2, Juan C. Martinez-Garcia 3 and Elena R. Alvarez-Buylla 1, 2*
1Departamento de Ecología Funcional, Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City,
Mexico, 2Centro de Ciencias de la Complejidad (C3), Universidad Nacional Autónoma de México, Mexico City, Mexico,
3Departamento de Control Automático, Cinvestav-Instituto Politécnico Nacional, Mexico City, Mexico
Robust temporal and spatial patterns of cell types emerge in the course of normal
development in multicellular organisms. The onset of degenerative diseases may result
from altered cell fate decisions that give rise to pathological phenotypes. Complex
networks of genetic and non-genetic components underlie such normal and altered
morphogenetic patterns. Here we focus on the networks of regulatory interactions
involved in cell-fate decisions. Such networks modeled as dynamical non-linear systems
attain particular stable configurations on gene activity that have been interpreted as
cell-fate states. The network structure also restricts the most probable transition patterns
among such states. The so-called Epigenetic Landscape (EL), originally proposed by
C. H. Waddington, was an early attempt to conceptually explain the emergence of
developmental choices as the result of intrinsic constraints (regulatory interactions)
shaped during evolution. Thanks to the wealth of molecular genetic and genomic studies,
we are now able to postulate gene regulatory networks (GRN) grounded on experimental
data, and to derive EL models for specific cases. This, in turn, has motivated several
mathematical and computational modeling approaches inspired by the EL concept,
that may be useful tools to understand and predict cell-fate decisions and emerging
patterns. In order to distinguish between the classical metaphorical EL proposal of
Waddington, we refer to the Epigenetic Attractors Landscape (EAL), a proposal that is
formally framed in the context of GRNs and dynamical systems theory. In this review we
discuss recent EAL modeling strategies, their conceptual basis and their application in
studying the emergence of both normal and pathological developmental processes. In
addition, we discuss how model predictions can shed light into rational strategies for cell
fate regulation, and we point to challenges ahead.
Keywords: GRN, epigenetic landscape, attractors, cell-fate, morphogenesis, stem-cells, cancer
1. Introduction
The progressive loss of potency from pluripotent stem cells to mature, differentiated cells, as well
as the reproducible emergence of spatiotemporal patterns through the course of development has
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
been always perceived as strong evidence of the robustness and
deterministic nature of development. The explanation of such
a robust process has puzzled researchers for many years. For a
long time, although not always stated explicitly, the prevailing
paradigm in developmental biology was supported on two fun-
damental paradigms: (1) a mature cell, once established, displays
an essentially irreversible phenotype; and (2) the developmen-
tal process is controlled by a “program” as a genomic blueprint
following a simplistic linear scheme of causation in an essen-
tially deterministic fashion. Experimental and theoretical studies
in the last decade have challenged these assumptions. It has been
shown that a differentiated state of a given cell is not irreversible
as previously thought, and that in fact, it is possible to repro-
gram differentiated cells into pluripotent states with a plethora
of protocols in plants and animals (Grafi, 2004; Takahashi and
Yamanaka, 2006; Takahashi et al., 2007; González et al., 2011).
Overall, a growing body of empirical evidence now supports
intrinsic physical processes as a fundamental source of order
instead of deterministic pre-programmed rules (Huang, 2009;
Mammoto and Ingber, 2010). Although these observations have
just recently shift the focus of study in developmental biology and
biomedical research, the new evidence is in line with the pro-
posals that early theoretical biologists posited decades ago (see,
for exampleWaddington, 1957; Goodwin, 1963; Kauffman, 1969,
1993; Goodwin, 2001). C. H. Waddington was one of the first to
point out that the physical implementation of the information
coded in the genes and their interactions imposes developmental
constraints while forming an organism. Waddington’s heuristic
model of the epigenetic landscape (EL) was a visionary attempt
to consolidate these ideas in a conceptual framework that enables
the discussion of the relationship between genetics, development,
and evolution in an intuitive manner. Waddington’s proposal
was inspired in a formal dynamic systems approach, nonetheless
(Waddington, 1957; Gilbert, 1991; Slack, 2002).
Nowadays in the data-rich, post-genomic era the EL has
been consolidated as a useful conceptual model for the discus-
sion of the mechanistic basis underlying cellular differentiation—
particularly trans-differentiation and reprogramming events
(Alvarez-Buylla et al., 2008; Enver et al., 2009; Fagan, 2012;
Ladewig et al., 2013). This field has become particularly active due
to its potential medical applications using stem cells systems biol-
ogy as a means for discovering efficient reprogramming or thera-
peutic strategies by combining mathematical and computational
modeling with experimental techniques (MacArthur et al., 2008,
2009; Roeder and Radtke, 2009; Huang, 2011; Zhou and Huang,
2011). Recently, though, numerous critiques to Waddington’s
original model have been presented in light of the dynamical
plasticity of differentiated cells (see, for example Balázsi et al.,
2011; Ferrell, 2012; Furusawa and Kaneko, 2012; Garcia-Ojalvo
and Arias, 2012; Sieweke, 2015). In this review, we claim that
the formalization of the EL in the context of the study of the
dynamical properties of GRNs enables a formal framework which
provides the necessary flexibility for a model to be both: (1)
consistent with the observed inherent plasticity of developing
cells and (2) formally derived from the uncovered regulatory
underpinnings of cell-fate regulation. It is thus important to note
that this GRN associated EL model is not to be confused with
the literal, metaphorical model presented by Waddington, which
some authors have associated only to the static diagrammatic
proposal originally put forward (West-Eberhard, 2003). In order
to highlight such distinction, here we will refer to the EL model
associated with the dynamics of GRNs as the epigenetic attractors
landscape (EAL).
The conceptual distinction between the classical EL and the
EAL proposed here, as well as its relevance as a consistent model
for the prevailing theories of differentiation is going to be exposed
by the authors elsewhere. In this contribution we instead focus
on the mathematical approaches which have been developed to
derive an EAL as an extension of the conventional dynamical
analyses of experimentally grounded GRN models. Importantly,
we deliberately use the generic term EAL to refer to a group of
dynamical models which are quite diverse in mathematical prop-
erties and structure, however we do so for phenomenological rea-
sons: all the approaches try to formally tackle the phenomenon of
cellular differentiation taking the classical EL model as a concep-
tual basis. Given the current relevance of such a modeling exer-
cise applied to molecular networks involved on processes such as
stem cell differentiation (Li andWang, 2013), tissue morphogen-
esis (Alvarez-Buylla et al., 2010), and carcinogenesis (Choi et al.,
2012; Wang et al., 2014); and the fact that different approaches
have been proposed in order to reach similar goals (Huang, 2009,
2012; Zhou et al., 2012), we hope that the present integrative
review may prove useful for a wide range of biological applica-
tions. Our main objective is 2-fold: (1) to help different research
groups attempting to formalize the EAL to reduce the gap exist-
ing between current different approaches and (2) to contribute to
shape a common and formal discussion ground on EAL models
among experimentalists and theoretical biologists. Accordingly,
we have decided to favor conceptual clarity over technicalities
through the text, and to point to original references where more
detail is available if necessary. We apologize for the theoretically
oriented reader for the lack of mathematical formality.
1.1. The Dynamical-Systems View of Cell Biology
The modern picture of the EL is framed in the context of GRN
dynamics (Kauffman, 1969; Mendoza and Alvarez-Buylla, 1998;
Huang, 2012), and its theoretical basis is a dynamical-systems
perspective. From here on we will refer to this view of the EL
model as the EAL. Under dynamical-systems framework a cell is
considered a dynamical system, assuming that its state at a certain
time can be described by a set of time-dependent variables. As
a first approximation, it is commonly assumed that the amount
of the different proteins within the cell or, for practical reasons,
the levels of gene expression (i.e., expression profiles) are suffi-
cient to describe such state (Huang, 2013). Thus, the expression
profile is conventionally taken as the set of variables represent-
ing the state of the cell; each gene in the cell’s GRN representing
one variable (see Figure 1). Mathematically, the set of variables is
represented as a state vector given by x(t) = [x1(t), x2(t)..., xn(t)]
for a GRN with n genes. Given such specification, it is useful to
imagine an abstract space termed the state space of the system.
In the context of GRNs the state space comprises all the theoret-
ically possible states a cell can exhibit; each point in this abstract
space represents one particular expression profile (Figure 1B).
Frontiers in Genetics | www.frontiersin.org 2 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
FIGURE 1 | From experimental data on gene function and interactions
to a dynamic gene regulatory network and epigenetic landscape
model. (A) The architecture of a GRN is proposed given available
experimental molecular data; the state of the network is specified as a gene
expression profile or gene on/off (1/0) configuration for the case of
continuous or discrete state models, respectively. Boolean or differential
equations are used, respectively. (B) The complete set of possible states
define a continuous (above) or discrete (below) state space, where each
state corresponds to a point; changes in gene expression during
developmental dynamics manifest as trajectories in this abstract space (here
depicted as arrows). (C) In an intuitive characterization of an EL, an
“elevation” value U(x) is associated to each network state x. The association
of “elevation” values to network states, or more generally, the quantitative
characterization of their relative stability is the ultimate goal of EL modeling
efforts. For illustrative purposes, the EL is depicted here as a hypothetical
low-dimensional projection.
Furthermore, it is assumed that the cell state at a certain time and
the cell state at a later time are connected by a state trajectory in
a causal way.
Mathematically, the current cell state is a function or a
more general mapping of the initial state and certain additional
parameters. The connection between cell states can be formally
expressed by a dynamical equation,
x(t + δt) = F(x(t),u, δt), (1)
where F represents the map that connects one state with the
immediately previous sate (F is also known as the transition
map), x(t) denotes the state at a certain time t, and u stands for
the vector of additional parameters. Both the time increments
δt and the state variables xi(t) can be either continuous or dis-
crete, depending on the chosen mathematical formalism. Within
the cell, the map F is implemented by the architecture of the
GRN, which specifies both the topology of the network and the
nature and form of the corresponding gene regulations (Huang,
2009). Because of globally conditioned gene behavior due to
mutual gene regulatory interactions, through the causal connec-
tions between cell states, the GRN imposes dynamical constraints
and limits the permissible behavior of the cell. Of special interest
are the transient and emergent stable configurations that the cell
may attain as a result. The existence of the dynamical map F
expresses the causality of the cellular developmental process and
the mechanistic character of GRN dynamical models.
One of the most salient and impressive features of GRNs is the
existence of a small number of stationary or quasi-stationary gene
configurations within the state space (Kauffman, 1969). Given a
specific GRN, a set of cell states satisfy the constraints imposed by
the GRN; that is, each of these cell states is connected to itself by
the map F (i.e., x∗ = F(x∗,u)). When these steady states (x∗) are
also resilient to perturbations, that is, if they return back to the
steady state after being kicked away by state variations either of
intrinsic or external origin, we refer to them as attractors. In the
case of quasi-stationary states, if a set comprised of several indi-
vidual states repeats in a cyclic manner it corresponds to a cyclic
attractor. All other states are either unstable or form part of tran-
sitory trajectories channeled toward one of these attractor states.
The theory posits that attractor states correspond to the observ-
able robust cell phenotypes, cell types, or cellular processes; and
that these emerge as a natural consequence of the dynamical con-
straints imposed by the underlying GRN (Huang and Kauffman,
2009; Huang, 2013). For a more formal definition of attractors in
dynamical systems theory see (Fuchs, 2013a).
Frontiers in Genetics | www.frontiersin.org 3 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
1.2. Extending GRN to EAL Models
The postulation of experimentally grounded GRN dynamical
models, their qualitative analysis and dynamical characterization
in terms of control parameters, and the validation of predicted
attractors against experimental observations has become a well-
established framework for the study of developmental dynamics
in systems biology—see, for example: (Mendoza and Alvarez-
Buylla, 1998; Von Dassow et al., 2000; Albert and Othmer, 2003;
Espinosa-Soto et al., 2004; Huang et al., 2007; Graham et al.,
2010; Sciammas et al., 2011; Hong et al., 2012; Jaeger and Crom-
bach, 2012; Azpeitia et al., 2014). The qualitative analysis of the
dynamics of GRNmodels is well-suited for the study of the spec-
ification of cell fates as a result of the constrains imposed by the
associated GRN. This conventional analysis includes the iden-
tification and local characterization of attractor states, and the
comparison of these predicted cell-type configurations with the
ones that are actually observed in the corresponding biological
system (Figures 1A,B).
If one is interested in studying the potential transition events
among the already characterized stable cellular phenotypes, how-
ever, several difficulties arise. Standard analysis of dynamical sys-
tems, which focuses on the existence and local properties of a
given attractor, fail to capture the main problem which is con-
cerned with the relative properties of the different attractors
(Zhou et al., 2012). In deterministic GRN models, given cer-
tain values for the related control parameters, the system under
study always converges to a single attractor if initialized from the
same state, and once it attains such steady-state it remains there
indefinitely. In contrast, during a developmental process, cells
change from one stable cell configuration to another one in spe-
cific temporal and spatial or morphogenic patterns. Additional
formalisms are needed in order to explore questions regarding
how cells in the course of differentiation transit among avail-
able given attractors, or the order in which the system con-
verges to the different attractors given an initial condition; as well
as to predict how these mechanisms can be altered by rational
strategies.
1.2.1. EAL Modeling Goals
The need for extending GRN dynamical models beyond stan-
dard local analysis is related with the interest in addressing
the following—and similar—questions. Conceptually, given an
experimentally determined GRN, how can we explain and pre-
dict both specific “normal” and altered cellular differentiation
events or morphogenic patterns? Is it possible to control the
fate of differentiation events through well-defined stimuli? Can
we deliberately cause altered morphogenic patterns by means of
either genetic, physical, chemical or other type of environmental
perturbations? Or formally, given a specific dynamical mapping
x(t + δt) = F(x(t),u, δt), and its associated state space, how can
we study the conditions under which a transition event occurs
among the attractor states x∗? Is there a reproducible pattern
of transitions? Can we alter the expected pattern through spe-
cific external control perturbations u? To what extent are the
observed robust and altered temporal or spatial morphogenetic
patterns emergent consequences of the GRNs? The extension of
GRN dynamical models and their analysis in order to address
these and similar questions has shown to be a fruitful area of
research in recent years (Han and Wang, 2007; Alvarez-Buylla
et al., 2008; Wang et al., 2010b, 2014; Choi et al., 2012; Qiu et al.,
2012; Villarreal et al., 2012; Zhou et al., 2012; Li and Wang, 2013;
Zhu et al., 2015). The conceptual basis for most of these efforts is
the EAL.
1.3. Deterministic EAL Models from Genetic
Circuits
1.3.1. An Introductory Toy Model
A quite simple auto-activating single-gene circuit, a basic model
of cell differentiation induction, is exposed in Ferrell (2012) as a
conceptual tool to discuss some difficulties regarding Wadding-
ton’s EL. In this work an EAL is mathematically described by
a potential function. In dynamical systems theory, besides the
state space approach explained briefly above, there is another
way to visualize the dynamics of a system, but applicable only
if the system is simple enough: the potential function (Strogatz,
2001; Fuchs, 2013a). The potential is a function V(x) which (in
one-dimensional systems) fulfills the relation given by:
dx
dt
= f (x) = −
dV(x)
dx
, (2)
i.e., f (x) is the negative derivative of the potential, which can be
found by direct integration:
V(x) = −
∫
f (x)dx. (3)
Such a function defines an attractor landscape for the given
dynamical system, and its plot graphically represents the dynam-
ics of the system (Figure 2). Specifically, minima of the potential
correspond to fixed-point attractors (e.g., cell types), and max-
ima correspond to unstable fixed-points. The motion, i.e., the
state trajectories are given by the gradient lines (the lines of steep-
est descent of the potential). The trajectories are attracted by the
minima of the potential. This corresponds to an intuitive, direct
derivation of the EAL: a “height” value is associated to each of
the points in the state-space in a way that those regions corre-
sponding to attractors will have a lower value than that of the
other transitory states (Figure 2C). Conceptually, the rolling ball
of Waddington’s EL will represent the state of a differentiating
cell moving from higher to lower regions in state space. Thus,
the calculated heights of the different attractors are expected to
reflect their developmental potential in a hierarchical way: the
lower height the lower potential for differentiation.
All one-dimensional systems have a potential function, but
most two- or higher-dimensional systems do not (Fuchs, 2013a).
This means that one could only apply this method if the cell is
represented by a single-gene (single variable) circuit. Further-
more, note that here the EAL plays the role of a “toy” model
useful in conceptual discussions, a role quite relevant (see Fer-
rell, 2012) but similar to that of the original metaphorical pro-
posal of Waddington. In this review we devote more attention
to the application of EAL models to real specific developmental
Frontiers in Genetics | www.frontiersin.org 4 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
FIGURE 2 | The derivation of a potential function to visualize the
epigenetic landscape and the dynamics of a system. (A) The
causal connection between the state of the system (an auto-activating
gene circuit) at a certain time and its state at a later time is modeled
by a differential equation. (B) Attractors in rate-balance analysis. The
blue dotted line and the red curve represent respectively a linear
degradation rate, and a non-linear synthesis rate for the circuit’s gene;
the restrictions imposed by the circuit to the systems dynamics are met
when both rates are balanced. The states that meet this balanced
condition are stationary, and if stable (filled circles), are denoted as
attractor states x*. Circles represent stationary states. (C) The potential
function. The attractor states x* lie at the bottoms of valleys (minima).
The trajectories starting from unstable, transitory states are attracted by
the minima of the potential. The relative stability of the left (right)
attractor with respect to the other is lower (higher) as quantified by the
lower (higher) barrier height between them.
processes with explanatory and predictive purposes that gener-
ally involve n-dimensional GRN. Thus, a more “realistic” sub-
network model incorporating several transcription factors in a
modular structure is necessary in such cases. The application of
the integration-based potential function approach, however, can-
not be applied to cases with a higher number of genes. Also, one
should be cautious when postulating the existence of a potential
for living systems in strict sense: a cell is an open non-equilibrium
thermodynamical system, and its dynamics in general does not
follow a gradient (since the transition rate between two given
attractor states is not path-independent). For details, see (Zhou
et al., 2012; Huang, 2013). For this reason authors use the term
“quasi-potential” when speaking about cellular dynamics from a
system-dynamics point of view (see below).
In the general case, the dynamics of continuous-time models
of GRNs is given by more general types of autonomous differen-
tial equations (DEs). The time evolution of the cell state x(t) is
commonly modeled by the system of DEs:
dxi(t)
dt
= Fi(x1, x2, ..., xf ,u), (4)
where i = 1, 2, ..., n for a GRN of n genes. A dynamics defined by
such a general DE is a special form of the map in Equation (1). In
general, the functions F in the continuous-timemodel for cellular
dynamics (Equation 4) are non-linear, and cannot be analytically
integrated and derived from a gradient. Numerical approaches
have been proposed to draw a deterministic “quasi-potential” for
two-gene circuits (see, for example Bhattacharya et al., 2011). In
what follows we focus on medium size GRN modules, where
neither the direct integration nor the numerical deterministic
approach are applicable. We start with the simplest models of
GRN dynamics.
1.4. Stochastic EAL Models from Boolean GRNs
The first computational model envisioned for the simulation and
analysis of the dynamic behavior of GRNs was the Boolean Net-
work (BN) model (Kauffman, 1969, 1993). This model has been
extended to model various developmental processes in the con-
text of the EAL (Han andWang, 2007; Alvarez-Buylla et al., 2008;
Ding andWang, 2011; Choi et al., 2012; Flöttmann et al., 2012). A
BN models a dynamical system assuming both discrete time and
discrete state. This is expressed formally with the mapping:
xi(t + 1) = Fi(x1(t), x2(t), ..., xf (t)), (5)
where the set of functions Fi are logical propositions expressing
the relationship between the genes that share regulatory interac-
tions with the gene i, and where the state variables xi(t) can take
the discrete values 1 or 0 indicating whether the gene i is active or
not at a certain time t, respectively. An experimentally grounded
Boolean GRN model is then completely specified by the set of
genes proposed to be involved in the process of interest and the
associated set of logical functions derived from experimental data
(Azpeitia et al., 2014). A dynamics defined by such a mapping is
a special form of the map in Equation (1).
Frontiers in Genetics | www.frontiersin.org 5 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
1.4.1. Attractor Transition Probability Approach to
Explore the EAL
As stated above, in a deterministic framework, once a cell state
corresponds to an attractor, it will remain there indefinitely.
The set of conditions that lead to each attractor comprise the
attracting basin. Under stochastic fluctuations, the borders of
attractor regions in state space may be reached and may be
crossed, leading to transitions from one attractor to another
one (Ebeling and Feistel, 2011). Thus, the implementation of
an stochastic dynamical model opens the opportunity to study
signal-independent transitions among attractors. There are sev-
eral approaches to include stochasticity in dynamical models.
One approach is based on the idea of introducing transition
probabilities. As discussed above, when studying cellular devel-
opmental dynamics, the transitions of interest are those among
attractor states. Can these transitions be studied in terms of
probabilities? Indeed, since Boolean GRN can be extended to
include stochasticity and transition probabilities among attrac-
tors can then be estimated. Several ways to include stochasticity
in a Boolean GRN model have been proposed (Garg et al., 2012).
One way is the so-called stochasticity in nodes (SIN)model. Here,
a constant probability of error ξ is introduced for the deter-
ministic Boolean functions. In other words, at each time step,
each gene “disobeys” its Boolean function with probability ξ .
Formally:
Pxi(t+1)[Fi(xregi(t))] = 1− ξ,
Pxi(t+1)[1− Fi(xregi(t))] = ξ.
(6)
The probability that the value of the now random variable xi(t+1)
is determined or not by its associated logical function Fi(xregi (t))
is 1− ξ or ξ , respectively.
Alvarez-Buylla and collaborators used this extended BN
model to explore the EAL associated with an experimentally
grounded GRN (Alvarez-Buylla et al., 2008) (see below). In a BN
model the set of possible states is finite. Specifically, due to its
binary state character the state space of a Boolean GRN with n
genes has a size of 2n and is composed by the set of all possi-
ble binary vectors of length n (see Figure 3A). By simulating a
stochastic one-step transition, according to the model in Equa-
tion (6) and the mapping in Equation (5), and starting from each
of all the possible states in the system for a large number of times,
it is possible to estimate the probability of transition from an
attractor i to an attractor j as the frequency of times the states
belonging to the basin of the attractor i are mapped into a state
within the basin of the attractor j. For detail see (Azpeitia et al.,
2014). In Alvarez-Buylla et al. (2008), the authors followed this
simulation approach to estimate a transition probability matrix
5 with components:
πij = P(At+1 = j|At = i), (7)
representing the probability that an attractor j is reached from an
attractor i (Figure 3B). Once the set of attractors is known and
the transition probabilitymatrix is estimated, it is straightforward
to implement a discrete time Markov chain model (DTMC) and
obtain a dynamic equation for the probability distribution (for
details, see Allen, 2010):
PA(t + 1) = 5PA(t), (8)
where PA(t) is the probability distribution over the attractors
at time t, and 5 is the transition probability matrix previously
estimated. This equation can be iterated to simulate the tempo-
ral evolution of the probability distribution over the attractors
starting from a biologically meaningful initial distribution. The
extension of a Boolean GRN in order to apply this approach
is quite simple and intuitive; however, there is a limitation that
impedes its general applicability: as the size of the GRN grows, it
becomes difficult to exhaustively characterize the attractor’s land-
scape associated with the GRN in terms of the emergent attractors
and its corresponding basins of attraction. If the dynamics of the
Boolean GRN is not exhaustively characterized, the correspond-
ing transition probabilities among attractors cannot be estimated
using the proposed approach. Additionally, other implementa-
tions of stochasticity within BN models have been discussed
(Garg et al., 2012). Additional examples should be worked out
with such various approaches to test which is more practical and
if all yield equivalent results.
1.4.2. Probabilistic Landscape (Quasi-potential)
Approach
Han and Wang proposed a different approach in order to extend
a BN model. Their goal was to first estimate the one-step transi-
tion probabilities among all the possible states in the state space
and not just among given attractors (Han and Wang, 2007). For
this, they implemented a variation of the BN that was previously
proposed by Li and collaborators (Li et al., 2004) and which has
been called the threshold network formalism (Thompson and
Galitski, 2012). In this model, the structure of the network is
formally represented with an adjacency matrix C, whose com-
ponents cij indicating the nature and strength of the interaction
from the gene j to gene i. The dynamic mapping for this BN
model takes the form:
xi(t + 1) =



















1,
∑
j
cijxj(t)+ bi > 0,
0,
∑
j
cijxj(t)+ bi < 0,
xi(t),
∑
j
cijxj(t)+ bi = 0,
(9)
where bi is a parameter representing the ground state of the gene
i: its state in the absence of regulation. The set of parameters
(i.e., bi and cij) can be chosen to force the dynamics of the BN
to be consistent with those of a BN with a specific set of logi-
cal propositions (for details, see Supplementary Material in Choi
et al., 2012). The mapping in Equation (9) can be conceptualized
as follows: if the total input of a gene in the network is positive
(activation), negative (repression) or zero; the future state of the
gene will be active, inactive or unchanged from its previous state,
respectively. Here, the total input of a gene is the sum of the pre-
vious states of the genes regulating it. The characterization of the
Frontiers in Genetics | www.frontiersin.org 6 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
FIGURE 3 | Stochastic epigenetic landscape models from Boolean
dynamics. (A) A simple mutual-inhibition circuit is modeled as a Boolean
network: discrete temporal evolution and binary (0,1) state variable. The
discrete state space corresponds to the set of binary vectors (here 4 possible
states) and is partitioned by two basins of attraction. (B) There are 4(22 )
possible transitions among the two attractors. A 2× 2 transition probability
matrix specifies the probability of each possible transition. (C) Han and Wang
proposed the use of a sigmoidal function of the total regulatory input (Ri) to
calculate the probability of a one-step state transition of one gene i (Han and
Wang, 2007). A specific value of the function (vertical red dotted line) gives
the probability of the gene i becoming active (left) or inactive (right), given its
regulatory input (Ri) in the current time. (D) There are 16(22 × 22 ) possible
transitions among the 4(22 ) possible states. A 22 × 22 transition probability
matrix specifies the probability of each possible transition.
entire attractor’s landscape can then be done through numeri-
cal iterations of this dynamical map as long as the network has
a moderate size.
Han and Wang extended the deterministic BN model into a
probabilistic framework by introducing a transition probability
matrix. However, if the interest is focused on the computation of
the probability of transition from one state to another state for
each of the 2n possible phenotypes in state space, then it is neces-
sary to introduce a transition probability matrix with the proba-
bility of all possible transitions and not just among attractors. In
order to make such computation feasible, Han and Wang intro-
duced a simplification: they assumed that the one-step transition
probability of one state to another can be expressed as the prod-
uct of the probability of each gene in the network being activated
or not, given the state of the network in the previous time (for
details, see Han and Wang, 2007, and Supplementary Material in
Choi et al., 2012). Formally:
πkj = P(x(t+ 1) = k|x(t) = j) =
n
∏
i=1
P(xi(t+ 1)|x(t) = j), (10)
where j and k represent two different cell states and can take
values from [1, ..., 2n]; n is the number of genes in the network.
The factorized transition probabilities are calculated by inserting
a non-zero regulatory input (
∑n
j= 1 cijxj(t) + bi(t) 6= 0) as the
argument of a sigmoidal function whose range spans from 0 to 1,
which is to say:
P(xi(t + 1) = 1|x(t) = j) =
1
2
±
1
2
tanh

µ
n
∑
j= 1
cijxj + bi

 .
(11)
In the case of no input (i.e.,
∑n
j= 1 cijxj(t)+bi = 0) a small-valued
parameter d is introduced:
P(xi(t + 1) = xi(t)|x(t) = j) = 1− d.
Hence, in this approach, the probability that a gene iwill be active
(1) at a future time t + 1 will be closer to one as long as its total
input at the previous time t is high. Similarly, the probability of
being inactive (0) at the future time will be closer to 1 as long as
the regulatory input is low (see Figure 3C). On the other hand, if
there is no input to the gene, the probability of no change from
its previous state is close to 1, and the closeness depends on the
parameter d, a small number representing self-degradation. Intu-
itively, these rules ensure that the state of a gene will flip only if
its total input is large enough.
Frontiers in Genetics | www.frontiersin.org 7 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
After having calculated these probabilities, the general idea is
then to use this information to obtain an appropriate “height”
measure for each of the 2n states. With this in mind, the interest
is first in calculating a steady-state probability distribution PSS(x).
This stationary probability distribution is analogous to stationary
configurations in the deterministic case; however, in the stochas-
tic framework, the probability of being in any particular state,
rather than the state of the system, is what is kept invariant along
time. In other words, when this stationary distribution is reached,
the probability of observing a cell in a particular state does
not change. Intuitively, one would expect that attractors would
have a higher probability of being reached than transitory states.
Thus, from a landscape perspective, the potency of differentiation
and height should be inversely related with the probability. The
approach that has been followed is to associate this PSS(x) with a
height value.Wang has proposed that the probability distribution
for a particular state P(xi) = exp[U(xi)], and from this expres-
sion thenU(xi) = − ln P(xi), where i = 1, ..., 2n. This functionU
has been termed the (probabilistic) quasi-potential (Huang, 2009,
2012; Wang et al., 2010b)]. How are the “quasi-potential” and the
steady-state probability formally related to each other is still an
open research area (Zhou et al., 2012) (see below).
The key point which has been emphasized by Wang and
coworkers is that, although there is (in general) no potential
function directly obtainable from the deterministic equations for
a given network, a generalized potential (or “quasi-potential”)
function can be constructed from its probabilistic description.
This generalized potential function is inversely related to the
steady-state probability (Wang et al., 2006; Han andWang, 2007;
Lapidus et al., 2008). For the case of the extended BN model,
once the transition matrix is calculated, the information of the
steady-state probabilities can be obtained by solving a discrete set
of master equations (ME) for the network (Han andWang, 2007).
The so-called ME is a dynamical equation for the temporal evo-
lution of a probability distribution (for details, see Haken, 1977;
Gardiner, 2009). In discrete form it is written as:
∂
∂t
P(xi) =
∑
j
WjiP(xj)−
∑
j
WijP(xi), (12)
where we usedWij to denote the transition probabilities resulting
from Equation (11). The difference between this dynamical equa-
tion and the one discussed in the previous section is that here the
time variable is treated as a continuous one. In general, it is quite
complicated to analyze MEs. In the case of this model, one ME is
obtained for each of the 2n states. Han andWang propose to ana-
lyzed the whole set of equations following a numerical (iterative)
method starting from uniform initial conditions Pxi (t0) = 1/2n
and iterating the system until a stationary distribution is reached
(Han and Wang, 2007).
1.5. Stochastic EAL Models from Continuous
GRNs
As in the case of the deterministic BN model revised above, a
general deterministic system of DEs used to describe a GRN can
be extended in order to include stochasticity. Such continuous
models may be more appropriate to approach certain biological
processes. The most intuitive extension considers the introduc-
tion of driving stochastic forces. In this approach, Equation (4) is
extended to:
d xi(t)
dt
= Fi(x,u)+ ξi(t), (13)
where ξi(t) is the ith component of a driving stochastic force with
zero mean value (i.e., < ξi(t) >= 0). This description, the so-
called Langevin equation, is frequently used to model cellular
dynamics under stochastic fluctuations (Hoffmann et al., 2008;
Wang et al., 2010b; Villarreal et al., 2012; Li and Wang, 2013).
Although intuitively simple at first sight, the consideration of
a randomly varying quantity affecting the dynamics of the sys-
tem implies several conceptual issues that should be considered
in some detail. Any single cell will follow an erratic trajectory
in state space, and its developmental dynamics will make each
realization different even if it starts from exactly the same initial
condition. Under this stochastic scenario, two equivalent perspec-
tives to study the stochastic dynamics can be considered. On the
one hand, the analysis could be focused on trajectories described
by Langevin-type equations, which describe the developmental
dynamics of a single cell (Figure 4A). On the other hand, as the
stochastic forces ξi(t) vary from cell to cell in an ensemble (pop-
ulation) of cells, the state x(t) will also vary from cell to cell at
any given time. One therefore may ask for the probability P(x, t)
to find the state of a cell in a given state interval of the state
space or, equivalently, for the frequency of cells in the ensem-
ble whose states are in that state interval. In the latter situation,
the focus shifts from the dynamics of the state of one cell to the
dynamics of the distribution over the states in a given ensemble
of cells. Indeed, an equation for the temporal evolution of this
distribution P(x, t) can be constructed, and this corresponds to
the so-called Fokker-Plank equation (FPE):
∂P
∂t
= −
∑
i
∂
∂xi
[Ai(x)P]+
1
2
∑
i,j
Qi,j(x)
∂2
∂xi∂xj
P. (14)
In mathematical terms, the corresponding process is known as a
diffusion process, a mathematical model for stochastic phenom-
ena evolving in continuous time; the vector A(x) is known as
the drift vector and the matrix Q(x) as the diffusion matrix (for
details, see Risken, 1984; Gardiner, 2009; Fuchs, 2013b). The FPE
describes the change of the probability distribution of a cell state
during the course of time (Figure 4B). Conceptually, the latter
modeling perspective can be interpreted as the temporal evolu-
tion of a cloud (ensemble) of cells diffusing across the state space
following both attracting and stochastic forces (see Huang, 2010
for a conceptual perspective).
The stochastic nature of the trajectories also produce quali-
tatively richer dynamics in state space. For example, if one is
interested in the developmental connection between one spe-
cific initial cell state and one specific final cell state—for exam-
ple, two different given attractors—there is no longer a single
Frontiers in Genetics | www.frontiersin.org 8 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
FIGURE 4 | Different approaches to study continuous-time
stochastic models of the epigenetic landscape and developmental
dynamics. (A) A continuous-time stochastic (diffusion) model is driven by a
drift (deterministic) component F and an stochastic (Noise) force. The
graph shows 10 different realizations of the stochastic dynamics of the
same, single cell starting form exactly the same initial condition (red dot).
This realizations perspective corresponds to the Langevin equation
description. The right histogram represents an approximation of the
corresponding distribution over the realizations. (B) The picture represents
the time evolution of a hypothetical probability distribution. A population of
cells initially presents a narrow distribution centered at an intermediate state
value: most cells have an intermediate state and no individuals show an
extreme (low or high) value. As time evolves the shape of the distribution
changes—gets wider—, and the population reaches lower and higher
values. This perspective corresponds to the Fokker-Planck equation
description. (C) A cell can follow different paths (gray dotted lines) to reach
a final state xf starting from an initial state x0. A finer quantitative
characterization of the specific transition from state x0 to state xf in terms
of highly probable paths and difficulty of differentiation processes can be
gained by means of calculating a dominant path (red line) for the transition
using a path-integral formalism. For simplicity, the cell state is represented
by one variable x in all three cases.
possible path connecting them. Instead, the same final cellu-
lar phenotype can be reached following different paths in state
space (Figure 4C). This situation raises yet additional interesting
issues: are all the paths equally probable? Is there a dominant path
for such a transition from one attractor to another one? Physicists
have proposed the so-called path-integral formalism in order to
tackle these and similar questions (Wio, 1999). Specifically, one
may want to answer what is the probability of starting from an
initial cellular phenotype at a certain time and ending in another
cellular phenotype at a future time. The conceptual basis of this
strategy is based on the idea of calculating an average trajectory
(e.g., integrating over the possible paths). The calculated aver-
aged path corresponds to the dominant path that the underlying
process is expected to preferentially follow (Figure 4C).
Given the intuitive appeal of a landscape perspective to general
dynamics, the existence of a potential or “potential-like” func-
tion associated with diffusive systems has been an intensive focus
of study in theoretical physics and applied mathematics. Ao and
co-workers have proposed a transformation that allows the defi-
nition of a functionU(x) which successfully acquires the dynami-
cal meaning of a potential function. The corresponding approach
has been applied successfully to study several biological systems
such as the phage lambda life cycle (Zhu et al., 2004), and the car-
cinogenesis processes, Ao et al. (2008), Wang et al. (2013, 2014),
and Zhu et al. (2015) from a landscape perspective. This trans-
formation has also been discussed recently in the context of gen-
eral methods for the decomposition of multivariate continuous
mappings F(x) and their associated quasi-potentials (Zhou et al.,
2012). From the available decomposition methods, the one that
has been applied the most to specific developmental processes is
the potential landscape and flux framework proposed by Wang
et al. (2008). In this framework, the continuous dynamical map-
ping F(x) is decomposed into a gradient part and a flux, curl part
(for details, see Wang, 2011). This approach has been applied, for
example, to the study of the yeast cell cycle [Wang et al. (2006,
2010a)]; a circadian oscillator (Wang et al., 2009); the generic
processes of stem cell differentiation and reprogramming (Wang
et al., 2010b; Xu et al., 2014); and neural differentiation (Qiu et al.,
2012). Recently, this method has been applied in the context of
the differentiation and reprogramming of a human stem cell net-
work (Li and Wang, 2013). Here we further discuss the latter as
a diffusion landscape approach to study stem cell differentiation.
Frontiers in Genetics | www.frontiersin.org 9 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
Although the technical details of decomposition methods for dif-
fusive systems from a landscape perspective are out of the scope
of the present review, we point the reader to Ao (2004), Kwon
et al. (2005), Yin and Ao (2006), Ao et al. (2007), Ge and Qian
(2012), Zhou et al. (2012), and Lv et al. (2014) for further details.
To summarize this section: when a stochastic component with
specific properties is introduced in a continuous-time dynami-
cal model of developmental dynamics, the behavior of the sys-
tem can be studied from different, mathematically equivalent
perspectives. One of the perspectives could be more appropri-
ate than the others, given the biological question of interest; the
different perspectives complement each other, nonetheless. It is
important to note that the three approaches mentioned above
(e.g., Langevin, FPE, and path-integral) although just recently
introduced in systems biology (Wang et al., 2010b, 2011; Villar-
real et al., 2012; Zhang and Wolynes, 2014; Wang et al., 2014);
are actually well-established tools in non-equilibrium statisti-
cal mechanics and the stochastic approach to complex systems
(Haken, 1977; Lindenberg and West, 1990; Gardiner, 2009).
1.6. From EAL Models to Biological Insights
1.6.1. EL Exploration in Flower Morphogenesis
Alvarez-Buylla and collaborators applied the attractor transition
probability approach (Equations 5–8 and Figure 3B) to explore
the EAL explained above in order to study flower patterning
shared by most angiosperms or flowering species (Alvarez-Buylla
et al., 2008). In flowering plants, a floral meristem is sequen-
tially partitioned into four regions from which the floral organ
primordia are formed and eventually give rise to sepals in the out-
ermost whorl, then to petals in the second whorl, stamens in the
third, and carpels in the fourth whorl in the central part of the
flower. This spatiotemporal pattern is widely conserved among
angiosperms. Can the temporal pattern of cell-fate attainment
be explained by the interplay of stochastic perturbations and the
constraints imposed by a non-linear GRN? Starting from the pre-
viously characterized Boolean GRN of organ identity genes in the
A. thaliana flower (Espinosa-Soto et al., 2004), and applying the
stochastic approach described in Equations (5–8), the authors
showed that the most probable order in which the attractors
are attained is, in fact, consistent with the temporal sequence in
which the specification of corresponding cellular phenotypes are
observed in vivo. The model provided, then, a novel explanation
for the emergence and robustness of the ubiquitous temporal pat-
tern of floral organ specification, and also allowed predictions on
the population dynamics of cells with different genetic configura-
tions during development (Alvarez-Buylla et al., 2008). Note that
in this approach, through the calculation of transition probabili-
ties among attractors, it is possible to explore the EAL associated
with a GRN. It also constitutes a new approach to understand-
ing a morphogenic process and also implies that GRN topologies
could have, in part, evolved in response to noisy environments.
In the same contribution, the authors also showed that a stochas-
tic continuous approximation of the GRN under analysis yielded
consistent results. Importantly, in this study it was argued that the
fact that observed patterns of cell-fate transitions could be signif-
icantly constrained by GRN in the context of noisy perturbations
does not excludes the relevance of deterministic signals.
1.6.2. From Probabilistic Landscapes to Putative
Cancer Therapies
The probabilistic landscape (quasi-potential) approach has been
applied to two specific processes: cell cycle regulation (Han and
Wang, 2007), and DNA damage response (Choi et al., 2012). In
the former case, the focus was on the global robustness prop-
erties of the network. Here we discuss the biological implica-
tions derived from the latter case. Choi and collaborators applied
this BN probabilistic landscape approach (Equations 9–12 and
Figures 3C,D) to study state transition in a simplified network
of the p53 tumor suppressor protein. The analysis of this net-
work from an EAL perspective allowed the systematic search for
combinatorial therapeutic treatments in cancer (Wang, 2013).
Given the network, key nodes and interactions that control p53
dynamics and the cellular response to DNA damage were identi-
fied by conducting single node and link mutation simulations;
as a result, one network component, the molecule Wip1, was
identified as one of the critical nodes. The flexibility of the BN
model also enabled the specification of a MCF7 cancer cell by
fixing the state of three nodes of the “normal” network in the
course of simulations (for details, see Choi et al., 2012; Wang,
2013). Having specified two different network models, it was pos-
sible to compare the dynamics and associated quasi-potential
of both normal and cancer cells in the absence and presence
of DNA damage. Previous experimental observations indicated
that prolonged p53 activity induces senescence or cell death; this
behavior was shown to result from the inhibition of the interac-
tion between the molecules Mdm2 and p53 caused by the action
of the small molecule Nutlin-3 (Purvis et al., 2012). Using the
model, Choi and collaborators predicted that neither Wip1 nor
Mdm2-p53 interaction mutation alone were sufficient to induce
cell death for MCF7 cancer cells in the presence of DNA damage;
furthermore, the model provided a mechanistic explanation for
this behavior: the effect of each of this perturbations alone is not
enough tomove the system out of an specific attractor’s basin. But
the simultaneous application of the two perturbations may drive
cancer cells to cell death or cell senescence attractors. These the-
oretical predictions were then validated using single-cell imaging
experiments (Choi et al., 2012; Wang, 2013).
This study illustrated in an elegant way how cancer therapeu-
tic strategies can be studied in mechanistic terms using a compu-
tational EALmodel. It must be pointed out that this result opened
the door to the rational design of system dynamics cancer thera-
peutical techniques, in contrast to trial and error and reductionist
approaches that have dominated the biomedical field up to now
(Huang and Kauffman, 2013).
1.6.3. A Diffusion Approach to Study the EAL
The three perspectives to study continuous-time stochastic mod-
els of developmental dynamics briefly described above and rep-
resented in Figure 4 have been applied to understanding actual
developmental cases from an EAL point of view. For example,
Villarreal and collaborators recently proposed a procedure to
construct a probabilistic EAL by calculating the probability dis-
tribution of stable gene expression configurations arising from
the topology of a general N-node GRN (Villarreal et al., 2012).
In this approach, the focus of study is the temporal evolution
Frontiers in Genetics | www.frontiersin.org 10 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
of the distribution over state space (Equation 14 and Figure 4B)
starting from a position centered on a specific attractor con-
figuration. Intuitively, the proposed framework predicts how a
cloud of cells distributed over a particular attractor will diffuse
in time to the neighboring regions (attractors) in state space,
given a specific GRN (which constraints the state trajectories).
The method has been applied to the case of early flower mor-
phogenesis (see subsection above); and its behavior, in both wild
type and mutant conditions. The authors recovered patterns that
are in agreement with the temporal developmental pattern of flo-
ral organs attainment in A. thaliana and most flowering species
(Alvarez-Buylla et al., 2008; Villarreal et al., 2012). The AEL per-
spective has recently also given important insights into the prob-
lem of carcinogenesis trough the quantitative implementation of
the molecular–cellular network hypothesis by Ao and co-workers
(for details, see Wang et al., 2014; Zhu et al., 2015).
1.6.4. Cell Fate Decisions in the Human Stem Cell
Landscape
Recently, Li and Wang adopted the diffusion approach to study
a previously published human stem cell developmental network
(see Chang et al., 2011) composed of 52 genes (Li and Wang,
2013). In this study they showed how the three perspectives rep-
resented in Figure 4 can complement each other in the study
of cellular differentiation: (1) through the numerical analysis
of the Langevin-like equations for the complete network they
acquired a landscape directly from the statistics of the trajecto-
ries of the system (Equation 13 and Figure 4A); (2) by means
of approximations they studied the evolution of the probabilistic
distribution and obtained an steady-state distribution (Equation
14 and Figure 4B); and (3) using the path-integral formalism
(Figure 4C) they calculated the dominant paths (Wang et al.,
2011). The obtained paths were interpreted as the biological paths
for differentiation and reprogramming (Li and Wang, 2013). As
Li and Wang showed, from the results of the three perspectives
it is possible to quantitatively describe the underlying EAL. One
then may be interested in how the EAL changes in response to
specific perturbations.
A general question in stem cell research concerns the underly-
ing mechanisms that explain the known reprogramming strate-
gies, which commonly consist on combining perturbations to
specific transcription factors. Li and Wang systematically tested
which genes and regulatory interactions imply the greatest alter-
ations to the quantitative properties of the EAL (e.g., height
values and transition rates) when perturbed. Interestingly, sev-
eral biological observations associated with the manipulation of
the so-called Yamanaka factors (Oct3/4, Sox2, Klf4, c-Myc)—the
transcription factors considered the core regulators in the induc-
tion of pluripotency—were consistent with the observed mod-
eling results. For example, simulated knockdown perturbations
to these factors consistently increased (lowered) the probabil-
ity (height) of the differentiation state. On the other hand, the
path-integral formalism allowed them to show how specific per-
turbations to these factors cause the differentiation process to
be easier or harder in terms of the time spent during transi-
tions and the characteristics of the differentiation paths. Over-
all, this study presented an important contribution toward the
mechanistic, dynamical explanation of the characterized repro-
gramming strategies in terms of the properties of the underlying
EAL.
1.7. Concluding Remarks
An overall strategy for the practical implementation of what
we call EAL models comprises four steps: (1) establishment
of an experimentally grounded GRN; (2) characterization of
the attractor (and quasi-potential) landscape through dynamical
modeling; (3) computational prediction of cell state responses to
specific perturbations; and (4) analysis of the prevailing paths of
cell fate change. The first step (1) is already a well-established
research problem that includes expert curation of experimental
data and/or statistical inference. In this review we focused on
the second step and presented examples of how steps (3) and (4)
can be achieved once a EAL model is effectively constructed. As
shown here, there are several ways to implement an EAL model
starting from a GRN. The specific choice should be made consid-
ering the properties of the network and the associated questions
of interest.
The methodologies reviewed here are mostly well-suited to
approach the problem of differentiation and temporal cell-fate
attainment in a mechanistic setting. The observed behavior
results from constraints given by the joint effect of non-linear
regulatory interactions and the inherent stochasticity prevalent
in GRN. The actual physical implementation of these generic
mechanisms in a multicellular system would necessarily imply
additional sources of constraint and spatially explicit, multi-level
modeling platforms. Tissue-level patterning mechanisms such as
cell-cell interactions; chemical signaling; cellular growth, pro-
liferation, and senescence; in addition to mechanic and elastic
forces at play in cells, tissues and organs, inevitably impose phys-
ical limitations which in turn affect cellular behavior. This would
thus imply non-homogenous GRNs with contrasting additional
chemical and physical constraints, that in a cooperative manner
underlie the emergence of positional information and morpho-
genetic patterns. Given this fact, the next logical step to extend
EAL and associated dynamical models would be to account for
these physical processes in an attempt to understand how cel-
lular decisions occur during tissue patterning and not just in
cell cultures. Although some progress has been presented in this
direction (see, for example Barrio et al., 2010, 2013), the problem
remains largely open, specially in terms of explicitly considering
the constrains imposed by the underlying GRN and EAL.
From a theoretical perspective, a further challenge would be
to carefully evaluate the assumptions implicit in the EAL mod-
els. For example, the adoption of the diffusive perspective briefly
explained above—which is often taken as a standard in stem
cell systems biology—implicitly assumes certain properties about
the forces driving the temporal evolution of the system (Linden-
berg and West, 1990). Are these conditions universally met by
developmental systems? Recent interesting work is starting to
suggest the biological relevance of additional constraints such
as state-dependent fluctuations (Pujadas and Feinberg, 2012;
Weber and Buceta, 2013), as well as time-dependent dynamical
behavior (Mitra et al., 2014; Verd et al., 2014). In both cases,
a dynamically changing EAL is proposed as a potentially more
Frontiers in Genetics | www.frontiersin.org 11 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
accurate description of developmental processes than its static
counterpart.
Overall, the application of the methodologies discussed in this
review to specific developmental processes has shown the practi-
cal relevance of dynamical models consistent with the conceptual
basis of the classical EL and the fundamental role of the con-
straints imposed by the GRN interactions. The different EAL
modeling approaches are useful to answer specific questions and
can complement each other. So far, EAL models have shown
to be an adequate framework for understanding stem cell dif-
ferentiation and reprogramming events in mechanistic terms;
and are also starting to show promise as the basis for rational
cancer therapeutic strategies, as well as other interesting issues
in developmental biology and evolution.
Acknowledgments
This work was supported by grants from CONACYT, Mexico:
240180, 180380, 167705, 152649 to ERA-B; from UNAM-
DGAPA-PAPIIT: IN203113, IN 203214, IN203814 (ERA-B). JD-
V receives a Phd scholarship from CONACYT. The authors
acknowledge the logistical and administrative help of Diana
Romo. The authors acknowledge the Centro de Ciencias de la
Complejidad (C3), UNAM.
References
Albert, R., and Othmer, H. G. (2003). The topology of the regulatory inter-
actions predicts the expression pattern of the segment polarity genes in
Drosophila melanogaster. J. Theor. Biol. 223, 1–18. doi: 10.1016/S0022-5193(03)
00035-3
Allen, L. J. (2010). An Introduction to Stochastic Processes with Applications to
Biology. Boca Raton, FL: CRC Press.
Alvarez-Buylla, E. R., Chaos, A., Aldana, M., Benítez, M., Cortes-Poza, Y.,
Espinosa-Soto, C., et al. (2008). Floral morphogenesis: stochastic explorations
of a gene network epigenetic landscape. PLoS ONE 3:e3626. doi: 10.1371/jour-
nal.pone.0003626
Alvarez-Buylla, E. R., Azpeitia, E., Barrio, R., Benítez, M., and Padilla-Longoria,
P. (2010). From abc genes to regulatory networks, epigenetic landscapes and
flower morphogenesis: making biological sense of theoretical approaches.
Semin. Cell Dev. Biol. 21, 108–117. doi: 10.1016/j.semcdb.2009.11.010
Ao, P., Kwon, C., and Qian, H. (2007). On the existence of potential land-
scape in the evolution of complex systems. Complexity 12, 19–27. doi:
10.1002/cplx.20171
Ao, P., Galas, D., Hood, L., and Zhu, X. (2008). Cancer as robust intrinsic state of
endogenous molecular-cellular network shaped by evolution. Med. Hypotheses
70, 678–684. doi: 10.1016/j.mehy.2007.03.043
Ao, P. (2004). Potential in stochastic differential equations: novel construction. J.
Physics A 37, L25. doi: 10.1088/0305-4470/37/3/L01
Azpeitia, E., Davila-Velderrain, J., Villarreal, C., and Alvarez-Buylla, E. R. (2014).
“Gene regulatory network models for floral organ determination,” in Flower
Development: Methods and Protocols, eds R. José Luis andW. Frank (New York,
NY: Springer), 441–469.
Balázsi, G., van Oudenaarden, A., and Collins, J. J. (2011). Cellular decision mak-
ing and biological noise: from microbes to mammals. Cell 144, 910–925. doi:
10.1016/j.cell.2011.01.030
Barrio, R. Á., Hernandez-MacHado, A., Varea, C., Romero-Arias, J. R., and
Alvarez-Buylla, E. (2010). Flower development as an interplay between dynami-
cal physical fields and genetic networks. PLoS ONE 5:e13523. doi: 10.1371/jour-
nal.pone.0013523
Barrio, R. A., Romero-Arias, J. R., Noguez, M. A., Azpeitia, E., Ortiz-Gutiérrez,
E., Hernández-Hernández, V., et al. (2013). Cell patterns emerge from cou-
pled chemical and physical fields with Cell proliferation dynamics: the Ara-
bidopsis thaliana root as a study system. PLoS Comput. Biol. 9:e1003026. doi:
10.1371/journal.pcbi.1003026
Bhattacharya, S., Zhang, Q., and Andersen, M. E. (2011). A deterministic map of
Waddington’s epigenetic landscape for Cell fate specification. BMC Syst. Biol.
5:85. doi: 10.1186/1752-0509-5-85
Chang, R., Shoemaker, R., and Wang, W. (2011). Systematic search for recipes to
generate induced pluripotent stem cells. PLoS Comput. Biol. 7:e1002300. doi:
10.1371/journal.pcbi.1002300
Choi, M., Shi, J., Jung, S. H., Chen, X., and Cho, K.-H. (2012). Attractor land-
scape analysis reveals feedback loops in the p53 network that control the
cellular response to dna damage. Sci. Signal. 5:ra83. doi: 10.1126/scisignal.
2003363
Ding, S., and Wang, W. (2011). Recipes and mechanisms of cellular reprogram-
ming: a case study on budding yeast Saccharomyces cerevisiae. BMC Syst. Biol.
5:50. doi: 10.1186/1752-0509-5-50
Ebeling, W., and Feistel, R. (2011). Physics of Self-organization and Evolution.
Weinheim: Wiley.com.
Enver, T., Pera, M., Peterson, C., and Andrews, P. W. (2009). Stem cell
states, fates, and the rules of attraction. Cell Stem Cell 4, 387–397. doi:
10.1016/j.stem.2009.04.011
Espinosa-Soto, C., Padilla-Longoria, P., and Alvarez-Buylla, E. R. (2004). A
gene regulatory network model for cell-fate determination during Arabidop-
sis thaliana flower development that is robust and recovers experimen-
tal gene expression profiles. Plant Cell 16, 2923–2939. doi: 10.1105/tpc.104.
021725
Fagan, M. B. (2012). Waddington redux: models and explanation in stem cell and
systems biology. Biol. Philos. 27, 179–213. doi: 10.1007/s10539-011-9294-y
Ferrell, J. E. (2012). Bistability, bifurcations, and Waddington’s epigenetic land-
scape. Curr. Biol. 22, R458–R466. doi: 10.1016/j.cub.2012.03.045
Flöttmann, M., Scharp, T., and Klipp, E. (2012). A stochastic model of epi-
genetic dynamics in somatic cell reprogramming. Front. Physiol. 3:216. doi:
10.3389/fphys.2012.00216
Fuchs, A. (2013a). Nonlinear Dynamics in Complex Systems. Berlin; Heidelberg:
Springer.
Fuchs, C. (2013b). Inference for Diffusion Processes: with Applications in Life
Sciences. Berlin; Heidelberg: Springer.
Furusawa, C., and Kaneko, K. (2012). A dynamical-systems view of stem cell
biology. Science 338, 215–217. doi: 10.1126/science.1224311
Garcia-Ojalvo, J., and Arias, A. M. (2012). Towards a statistical mechan-
ics of cell fate decisions. Curr. Opin. Genet. Dev. 22, 619–626. doi:
10.1016/j.gde.2012.10.004
Gardiner, C. W. (2009). Stochastic Methods. Berlin: Springer.
Garg, A., Mohanram, K., De Micheli, G., and Xenarios, I. (2012). “Implicit meth-
ods for qualitative modeling of gene regulatory networks,” in Gene Regulatory
Networks, eds D. Bart and G. Nele (New York, NY: Springer), 397–443.
Ge, H., and Qian, H. (2012). Landscapes of non-gradient dynamics with-
out detailed balance: stable limit cycles and multiple attractors. Chaos 22,
023140–023140. doi: 10.1063/1.4729137
Gilbert, S. F. (1991). Epigenetic landscaping: Waddington’s use of cell fate bifurca-
tion diagrams. Biol. Philos. 6, 135–154.
González, F., Boué, S., and Belmonte, J. C. I. (2011). Methods for making induced
pluripotent stem cells: reprogramming a la carte. Nat. Rev. Genet. 12, 231–242.
doi: 10.1038/nrg2937
Goodwin, B. C. (1963). Temporal Organization in Cells. A Dynamic Theory of
Cellular Control Processes. London; New York, NY: Academic Press.
Goodwin, B. C. (2001). How the Leopard Changed its Spots: the Evolution of
Complexity. Princeton, NJ: Princeton University Press.
Grafi, G. (2004). How cells dedifferentiate: a lesson from plants.Dev. Biol. 268, 1–6.
doi: 10.1016/j.ydbio.2003.12.027
Graham, T. G., Tabei, S. A., Dinner, A. R., and Rebay, I. (2010). Modeling bistable
cell-fate choices in the Drosophila eye: qualitative and quantitative perspectives.
Development 137, 2265–2278. doi: 10.1242/dev.044826
Frontiers in Genetics | www.frontiersin.org 12 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
Haken, H. (1977). Synergetics. An Introduction. Nonequilibrium Phase Trasitions
and Self-organization in Physics, Chemistry, and Biology. Berlin; Heidelberg;
New York, NY: Springer-Verlag.
Han, B., and Wang, J. (2007). Quantifying robustness and dissipation cost of yeast
Cell cycle network: the funneled energy landscape perspectives. Biophys. J. 92,
3755–3763. doi: 10.1529/biophysj.106.094821
Hoffmann, M., Chang, H. H., Huang, S., Ingber, D. E., Loeffler, M., and Galle, J.
(2008). Noise-driven stem cell and progenitor population dynamics. PLoS ONE
3:e2922. doi: 10.1371/journal.pone.0002922
Hong, T., Xing, J., Li, L., and Tyson, J. (2012). A simple theoretical framework
for understanding heterogeneous differentiation of cd4+ t cells. BMC Syst. Biol.
6:66. doi: 10.1186/1752-0509-6-66
Huang, S., and Kauffman, S. (2009). “Complex gene regulatory networks-from
structure to biological observables: cell fate determination,” in Encyclopedia of
Complexity and Systems Science, ed R. A. Meyers (New York, NY: Springer),
1180–1293.
Huang, S., and Kauffman, S. (2013). How to escape the cancer attractor: ratio-
nale and limitations of multi-target drugs. Semin. Cancer Biol. 23, 270–278. doi:
10.1016/j.semcancer.2013.06.003
Huang, S., Guo, Y.-P., May, G., and Enver, T. (2007). Bifurcation dynamics in
lineage-commitment in bipotent progenitor cells. Dev. Biol. 305, 695–713. doi:
10.1016/j.ydbio.2007.02.036
Huang, S. (2009). Reprogramming cell fates: reconciling rarity with robustness.
Bioessays 31, 546–560. doi: 10.1002/bies.200800189
Huang, S. (2010). Cell lineage determination in state space: a systems view brings
flexibility to dogmatic canonical rules. PLoS Biol. 8:e1000380. doi: 10.1371/jour-
nal.pbio.1000380
Huang, S. (2011). Systems biology of stem cells: three useful perspectives to help
overcome the paradigm of linear pathways. Philos. Trans. R. Soc. B Biol. Sci.
366, 2247–2259. doi: 10.1098/rstb.2011.0008
Huang, S. (2012). The molecular and mathematical basis of waddington’s epi-
genetic landscape: a framework for post-darwinian biology? Bioessays 34,
149–157. doi: 10.1002/bies.201100031
Huang, S. (2013). Genetic and non-genetic instability in tumor progression:
link between the fitness landscape and the epigenetic landscape of can-
cer cells. Cancer Metastasis Rev. 32, 423–448. doi: 10.1007/s10555-013-
9435-7
Jaeger, J., and Crombach, A. (2012). “Lifes attractors,” in Evolutionary Systems
Biology, ed O. S. Soyer (New York, NY: Springer-Verlag), 93–119.
Kauffman, S. A. (1969). Metabolic stability and epigenesis in randomly constructed
genetic nets. J. Theor. Biol. 22, 437–467.
Kauffman, S. (1993). The Origins of Order: Self Organization and Selection in
Evolution. New York, NY: Oxford University Press.
Kwon, C., Ao, P., and Thouless, D. J. (2005). Structure of stochastic dynam-
ics near fixed points. Proc. Natl. Acad. Sci. U.S.A. 102, 13029–13033. doi:
10.1073/pnas.0506347102
Ladewig, J., Koch, P., and Brüstle, O. (2013). Leveling Waddington: the emergence
of direct programming and the loss of cell fate hierarchies. Nat. Rev. Mol. Cell
Biol. 14, 225–236. doi: 10.1038/nrm3543
Lapidus, S., Han, B., and Wang, J. (2008). Intrinsic noise, dissipation cost,
and robustness of cellular networks: the underlying energy landscape of
mapk signal transduction. Proc. Natl. Acad. Sci. U.S.A. 105, 6039–6044. doi:
10.1073/pnas.0708708105
Li, C., and Wang, J. (2013). Quantifying cell fate decisions for differentia-
tion and reprogramming of a human stem cell network: landscape and
biological paths. PLoS Comput. Biol. 9:e1003165. doi: 10.1371/journal.pcbi.
1003165
Li, F., Long, T., Lu, Y., Ouyang, Q., and Tang, C. (2004). The yeast cell-cycle net-
work is robustly designed. Proc. Natl. Acad. Sci. U.S.A. 101, 4781–4786. doi:
10.1073/pnas.0305937101
Lindenberg, K., andWest, B. J. (1990). The Nonequilibrium Statistical Mechanics of
Open and Closed Systems. New York, NY: VCH.
Lv, C., Li, X., Li, F., and Li, T. (2014). Constructing the energy landscape for
genetic switching system driven by intrinsic noise. PLoS ONE 9:e88167. doi:
10.1371/journal.pone.0088167
MacArthur, B. D., Ma’ayan, A., and Lemischka, I. R. (2008). Toward stem cell sys-
tems biology: from molecules to networks and landscapes. Cold Spring Harb.
Symp. Quant. Biol. 73, 211–215. doi: 10.1101/sqb.2008.73.061
MacArthur, B. D., Ma’ayan, A., and Lemischka, I. R. (2009). Systems biology of
stem cell fate and cellular reprogramming.Nat. Rev. Mol. Cell Biol. 10, 672–681.
doi: 10.1038/nrm2766
Mammoto, T., and Ingber, D. E. (2010). Mechanical control of tissue
and organ development. Development 137, 1407–1420. doi: 10.1242/dev.
024166
Mendoza, L., and Alvarez-Buylla, E. R. (1998). Dynamics of the genetic regula-
tory network forArabidopsis thaliana flowermorphogenesis. J. Theor. Biol. 193,
307–319.
Mitra, M. K., Taylor, P. R., Hutchison, C. J., McLeish, T., and Chakrabarti, B.
(2014). Delayed self-regulation and time-dependent chemical drive leads to
novel states in epigenetic landscapes. J. R. Soc. Interface 11, 20140706. doi:
10.1098/rsif.2014.0706
Pujadas, E., and Feinberg, A. P. (2012). Regulated noise in the epige-
netic landscape of development and disease. Cell 148, 1123–1131. doi:
10.1016/j.cell.2012.02.045
Purvis, J. E., Karhohs, K. W., Mock, C., Batchelor, E., Loewer, A., and Lahav, G.
(2012). p53 dynamics control cell fate. Science 336, 1440–1444. doi: 10.1126/sci-
ence.1218351
Qiu, X., Ding, S., and Shi, T. (2012). From understanding the development land-
scape of the canonical fate-switch pair to constructing a dynamic landscape
for two-step neural differentiation. PLoS ONE 7:e49271. doi: 10.1371/jour-
nal.pone.0049271
Risken, H. (1984). Fokker-Planck Equation. Berlin; Heidelberg: Springer.
Roeder, I., and Radtke, F. (2009). Stem cell biology meets systems biology. Devel-
opment 136, 3525–3530. doi: 10.1242/dev.040758
Sciammas, R., Li, Y., Warmflash, A., Song, Y., Dinner, A. R., and Singh, H.
(2011). An incoherent regulatory network architecture that orchestrates b cell
diversification in response to antigen signaling. Mol. Syst. Biol. 7:495. doi:
10.1038/msb.2011.25
Sieweke, M. H. (2015). Waddingtons valleys and captain cooks islands. Cell Stem
Cell 16, 7–8. doi: 10.1016/j.stem.2014.12.009
Slack, J. M. (2002). Conrad hal Waddington: the last renaissance biologist? Nat.
Rev. Genet. 3, 889–895. doi: 10.1038/nrg933
Strogatz, S. (2001). Nonlinear Dynamics and Chaos: with Applications to Physics,
Biology, Chemistry and Engineering. New York, NY: Perseus Books Group.
Takahashi, K., and Yamanaka, S. (2006). Induction of pluripotent stem cells from
mouse embryonic and adult fibroblast cultures by defined factors. Cell 126,
663–676. doi: 10.1016/j.cell.2006.07.024
Takahashi, K., Tanabe, K., Ohnuki, M., Narita, M., Ichisaka, T., Tomoda,
K., et al. (2007). Induction of pluripotent stem cells from adult human
fibroblasts by defined factors. Cell 131, 861–872. doi: 10.1016/j.cell.2007.
11.019
Thompson, E. G., and Galitski, T. (2012). Quantifying and analyzing the network
basis of genetic complexity. PLoS Comput. Biol. 8:e1002583. doi: 10.1371/jour-
nal.pcbi.1002583
Verd, B., Crombach, A., and Jaeger, J. (2014). Classification of transient
behaviours in a time-dependent toggle switch model. BMC Syst. Biol. 8:43. doi:
10.1186/1752-0509-8-43
Villarreal, C., Padilla-Longoria, P., and Alvarez-Buylla, E. R. (2012). General theory
of genotype to phenotype mapping: derivation of epigenetic landscapes from
N-node complex gene regulatory networks. Phys. Rev. Lett. 109:118102. doi:
10.1103/PhysRevLett.109.118102
Von Dassow, G., Meir, E., Munro, E. M., and Odell, G. M. (2000). The segment
polarity network is a robust developmental module. Nature 406, 188–192. doi:
10.1038/35018085
Waddington, C. H. (1957). The Strategy of Genes. London: George Allen & Unwin,
Ltd.
Wang, J., Huang, B., Xia, X., and Sun, Z. (2006). Funneled landscape leads to
robustness of cell networks: yeast cell cycle. PLoS Comput. Biol. 2:e147. doi:
10.1371/journal.pcbi.0020147
Wang, J., Xu, L., and Wang, E. (2008). Potential landscape and flux framework
of nonequilibrium networks: robustness, dissipation, and coherence of bio-
chemical oscillations. Proc. Natl. Acad. Sci. U.S.A. 105, 12271–12276. doi:
10.1073/pnas.0800579105
Wang, J., Xu, L., andWang, E. (2009). Robustness and coherence of a three-protein
circadian oscillator: landscape and flux perspectives. Biophys. J. 97, 3038–3046.
doi: 10.1016/j.bpj.2009.09.021
Frontiers in Genetics | www.frontiersin.org 13 April 2015 | Volume 6 | Article 160
Davila-Velderrain et al. Modeling the epigenetic attractors landscape
Wang, J., Li, C., and Wang, E. (2010a). Potential and flux landscapes quantify the
stability and robustness of budding yeast cell cycle network. Proc. Natl. Acad.
Sci. U.S.A. 107, 8195–8200. doi: 10.1073/pnas.0910331107
Wang, J., Xu, L., Wang, E., and Huang, S. (2010b). The potential landscape of
genetic circuits imposes the arrow of time in stem cell differentiation. Biophys.
J. 99, 29–39. doi: 10.1016/j.bpj.2010.03.058
Wang, J., Zhang, K., Xu, L., and Wang, E. (2011). Quantifying the Waddington
landscape and biological paths for development and differentiation. Proc. Natl.
Acad. Sci. U.S.A. 108, 8257–8262. doi: 10.1073/pnas.1017017108
Wang, G., Zhu, X., Hood, L., and Ao, P. (2013). From phage lambda to human
cancer: endogenous molecular-Cellular network hypothesis. Quant. Biol. 1–18.
doi: 10.1007/s40484-013-0007-1
Wang, G., Zhu, X., Gu, J., and Ao, P. (2014). Quantitative implementation
of the endogenous molecular–cellular network hypothesis in hepatocellular
carcinoma. Interface Focus 4, 20130064. doi: 10.1098/rsfs.2013.0064
Wang, J. (2011). Potential landscape and flux framework of nonequilibrium bio-
logical networks. Annu. Rep. Comput. Chem. 7, 1. doi: 10.1016/B978-0-444-
53835-2.00001-8
Wang, W. (2013). Therapeutic hints from analyzing the attractor landscape of the
p53 regulatory circuit. Sci. Signal. 6, pe5. doi: 10.1126/scisignal.2003820
Weber, M., and Buceta, J. (2013). Stochastic stabilization of phenotypic states: the
genetic bistable switch as a case study. PLoS ONE 8:e73487. doi: 10.1371/jour-
nal.pone.0073487
West-Eberhard, M. J. (2003). Developmental Plasticity and Evolution. New York,
NY: Oxford University Press.
Wio, H. (1999). “Application of path integration to stochastic processes: an intro-
duction,” in Fundamentals and Applications of Complex Systems, ed Nueva (San
Luis: Univ. UN), 253.
Xu, L., Zhang, K., and Wang, J. (2014). Exploring the mechanisms of differen-
tiation, dedifferentiation, reprogramming and transdifferentiation. PLoS ONE
9:e105216. doi: 10.1371/journal.pone.0105216
Yin, L., and Ao, P. (2006). Existence and construction of dynamical potential in
nonequilibrium processes without detailed balance. J. Phys. A 39, 8593. doi:
10.1088/0305-4470/39/27/003
Zhang, B., and Wolynes, P. G. (2014). Stem cell differentiation as a many-
body problem. Proc. Natl. Acad. Sci. U.S.A. 111, 10185–10190. doi:
10.1073/pnas.1408561111
Zhou, J. X., and Huang, S. (2011). Understanding gene circuits at cell-fate
branch points for rational cell reprogramming. Trends Genet. 27, 55–62. doi:
10.1016/j.tig.2010.11.002
Zhou, J. X., Aliyu, M., Aurell, E., and Huang, S. (2012). Quasi-potential land-
scape in complex multi-stable systems. J. R. Soc. Interface 9, 3539–3553. doi:
10.1098/rsif.2012.0434
Zhu, X.-M., Yin, L., Hood, L., and Ao, P. (2004). Calculating biological behaviors
of epigenetic states in the phage λ life cycle. Funct. Integr. Genomics 4, 188–195.
doi: 10.1007/s10142-003-0095-5
Zhu, X., Yuan, R., Hood, L., and Ao, P. (2015). Endogenous molecular-
cellular hierarchical modeling of prostate carcinogenesis uncovers robust struc-
ture. Prog. Biophys. Mol. Biol. 117, 30–42. doi: 10.1016/j.pbiomolbio.2015.
01.004
Conflict of Interest Statement: The authors declare that the research was con-
ducted in the absence of any commercial or financial relationships that could be
construed as a potential conflict of interest.
Copyright © 2015 Davila-Velderrain, Martinez-Garcia and Alvarez-Buylla. This is
an open-access article distributed under the terms of the Creative Commons Attribu-
tion License (CC BY). The use, distribution or reproduction in other forums is per-
mitted, provided the original author(s) or licensor are credited and that the original
publication in this journal is cited, in accordance with accepted academic practice.
No use, distribution or reproduction is permitted which does not comply with these
terms.
Frontiers in Genetics | www.frontiersin.org 14 April 2015 | Volume 6 | Article 160
Chapter 4
Resultados
No research program has sought to determine the implications of adaptive processes
that mold systems with their own inherent order.
— Stuart Kauffman, The Origins of Order (1993)
... all organisms are a mixture of conserved and nonconserved processes (said
otherwise, of unchanging and changing processes), rather than a uniform collection of
processes that change equally in the sources of variation in the course of evolution.
— Kirschner and Gerhart, The Plausibility of Life (2005)
111
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 
DOI 10.1186/s12918-015-0166-y
RESEARCH ARTICLE Open Access
Reshaping the epigenetic landscape during
early flower development: induction of
attractor transitions by relative differences in
gene decay rates
Jose Davila-Velderrain1,2, Carlos Villarreal2,3* and Elena R Alvarez-Buylla1,2*
Abstract
Background: Gene regulatory network (GRN) dynamical models are standard systems biology tools for the
mechanistic understanding of developmental processes and are enabling the formalization of the epigenetic
landscape (EL) model.
Methods: In this work we propose a modeling framework which integrates standard mathematical analyses to
extend the simple GRN Boolean model in order to address questions regarding the impact of gene specific
perturbations in cell-fate decisions during development.
Results: We systematically tested the propensity of individual genes to produce qualitative changes to the EL
induced by modification of gene characteristic decay rates reflecting the temporal dynamics of differentiation stimuli.
By applying this approach to the flower specification GRN (FOS-GRN) we uncovered differences in the functional
(dynamical) role of their genes. The observed dynamical behavior correlates with biological observables. We found a
relationship between the propensity of undergoing attractor transitions between attraction basins in the EL and the
direction of differentiation during early flower development - being less likely to induce up-stream attractor transitions
as the course of development progresses. Our model also uncovered a potential mechanism at play during the
transition from EL basins defining inflorescence meristem to those associated to flower organs meristem. Additionally,
our analysis provided a mechanistic interpretation of the homeotic property of the ABC genes, being more likely to
produce both an induced inter-attractor transition and to specify a novel attractor. Finally, we found that there is a
close relationship between a gene’s topological features and its propensity to produce attractor transitions.
Conclusions: The study of how the state-space associated with a dynamical model of a GRN can be restructured by
modulation of genes’ characteristic expression times is an important aid for understanding underlying mechanisms
occurring during development. Our contribution offers a simple framework to approach such problem, as exemplified
here by the case of flower development. Different GRN models and the effect of diverse inductive signals can be
explored within the same framework. We speculate that the dynamical role of specific genes within a GRN, as
uncovered here, might give information about which genes are more likely to link a module to other regulatory
circuits and signaling transduction pathways.
Keywords: Gene regulatory network, Epigenetic landscape, Attractor landscape, Differentiation, Flower
development, Attractor transitions
*Correspondence: carlos@fisica.unam.mx; eabuylla@gmail.com
2 Centro de Ciencias de la Complejidad (C3), Universidad Nacional Autónoma
de México, Cd. Universitaria, 04510 México, D.F., México
1 Instituto de Ecología, Universidad Nacional Autónoma de México, Cd.
Universitaria, 04510 México, D.F., México
Full list of author information is available at the end of the article
© 2015 Davila-Velderrain et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication
waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise
stated.
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 2 of 14
Background
The systems perspective to biology has successfully
rephrased long-standing questions in developmental biol-
ogy in terms of the dynamical behavior of molecular
networks [1-4]. A salient example is the increasing use of
gene regulatory network (GRN) models to study cell-fate
specification [5-9]. How can cells with the same genotype
and gene regulatory network in multicellular organisms
attain different cell fates? How are the steady-state gene
expression configurations that characterize each cell-type
attained? Why do we observe certain cellular phenotypes
and not others? How are the temporal and spatial pat-
terns of cell-fate decisions established and how are they
robustly maintained? The dynamical analysis of GRNs
has given insights into these and other important ques-
tions concerning cell differentiation and morphogenesis,
the two components of development. In short, GRNmod-
els are showing how observed differentiation patterns can
be understood in mechanistic terms [10]. Overall, exper-
imentally grounded GRN models constitute multistable
dynamical systems able to recover stable steady states (or
attractors) corresponding to fixed profiles of gene activa-
tion that mimic those characterizing different cell types in
both plants and animals (e.g., [11,12]). Such profiles are
commonly interpreted as cell fates [1,4,13].
The first, and arguably the simplest, model of GRN
dynamics is the Boolean network model proposed by
Stuart Kauffman [14]. This model is based on strong
assumptions, mainly: (1) gene activity shows binary
(on/off ) behavior; (2) the temporal change in gene activ-
ity occurs in discrete, regular steps; and, originally, (3) the
activity state of the whole network evolves in a synchro-
nized manner [15]. Albeit highly abstract at first sight, the
applicability of Boolean GRNs, as well as derived concep-
tual implications, have been supported extensively both
by experimental observations [5,16,17] and by theoreti-
cal GRNs grounded on experimental data [11,18]. A first
example of the latter was proposed to understand cell-
fate attainment during early flower development [19]. The
Boolean GRN model has become a well established mod-
eling tool in systems biology that is intuitive and attractive
to biologists [20,21].
In addition, simple GRN dynamical models are enabling
the formalization of old biology metaphors such as the
conceptual model of the epigenetic landscape (EL) pro-
posed by C.H. Waddington in 1950s [22-25]. In modern
post-genomic biology the EL has been consolidated as the
preferred conceptual framework for the discussion of the
mechanistic basis underlying cellular differentiation and
plasticity [26-28]. A formal basis for this metaphorical EL
is being developed in the context of GRNs [24,29-32]. The
key for this formalization is to consider that, as well as
generating the cellular phenotypic sates (attractors), the
GRN dynamics also partitions the whole state-space –
the abstract space containing all the possible states of a
given system – in specific regions restricting the trajecto-
ries from one state to another one. The formalization of
the EL in this context is conceptually straightforward: the
number, depth, width, and relative position of the attrac-
tor’s basins of attraction would correspond to the hills
and valleys of the metaphorical EL [24]. Here, we refer
to the structured order of the basins in state-space as the
attractors landscape (AL). For our purposes, the charac-
terization of an AL would correspond, in practical terms,
to the characterization of an EL (see below). There is an
increasing interest to model the EL associated with a GRN
[9,24,30,33-37].
Despite developments in both the conceptual and tech-
nical aspects of GRN modeling, interest in novel ques-
tions associated with developmental cell plasticity calls
for extended modeling frameworks. For example, pre-
vious modeling approaches are not able to address the
importance of quantitative alterations of the GRN compo-
nents in attractors (cell-fates) attainment and transitions,
or the importance of particular GRN components in mov-
ing the system from a particular steady-state or cell fate
to another one. In an attempt to contribute to such a
need, in this work we propose a modeling framework
that integrates standard dynamical systems analyses to
extend the simple GRN Booleanmodel in order to address
questions regarding the impact of gene specific pertur-
bations in cell-fate decisions during development. Two
different, non-exclusive, approaches are commonly fol-
lowed in the study of GRN developmental dynamics: (1)
analyzing a large set of randomly (or exhaustively) assem-
bled networks (see, for example [38-40]); or (2) focusing
on one, well-characterized and experimentally grounded
GRN [11,18]. In this work we adopt the second approach.
One of the first GRN models, which is experimentally
grounded and has been extensively validated and used to
test different approaches, is the floral organ specification
GRN (FOS-GRN). The GRNmodel proposes a regulatory
module underlying floral organ determination in Ara-
bidopsis thaliana during early stages of flower develop-
ment [11,19,41]. The network is grounded in experimental
data for 15 genes and their interactions. Among the 15
genes, five are grouped into three classes (A-type, B-type,
and C- type), whose combinations have been shown -
through molecular developmental genetic studies - to be
necessary for floral organ cell specification. A-type genes
(AP1 and AP2) are required for sepal identity, A-type
together with B-type (AP3 and PI) for petal identity, B-
type and C-type (AGAMOUS) for stamen identity, and
the C-type gene (AG) alone for carpel primordia cell
identity. The so-called ABC model describes such combi-
natorial activities during floral organ determination [42].
The original Boolean FOS-GRN converges to ten attrac-
tors that correspond to the main cell types observed
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 3 of 14
during early flower development, and thus provided a
mechanistic explanation to the ABC model. Six attractors
correspond to sepal (Sep), petal (Pt1 and Pt2), stamen (St1
and St2), and carpel (Car) primordial cells within flower
meristems with the expected ABC gene combinations for
each floral organ primordi. In addition it explained the
configurations that characterize the inflorescence meris-
tem: four attractors correspond to meristematic cells of
the inflorescence, which is partitioned into four regions
(Inf1, Inf2, Inf3, and Inf4). This network has become one
of the prototypical systems for theoretical analyses of cell
differentiation and morphogenesis [43], and it has been
shown to be well-suited to explore new questions and
propose new methodologies.
For example, recently an EL model for flower develop-
ment based on a continues stochastic approximation of
the Boolean GRN showed that characteristic multigene
configurations emerge from the constraints imposed by
the GRN; but the temporal pattern of cell transitions also
seems to depend on the asymmetry in gene expression
times-scales for some of the main regulators [33]. Based
on this work, it was suggested that parameters represent-
ing finer regulatory processes, such as gene expression
decay rates, enable richer and more accurate descrip-
tions of the underlying cellular transitions. Specifically,
the results suggested that relative differences in the decay
rates of particular genes may be important for the estab-
lishment of the robust pattern of differentiation transition
observed during floral organ determination. Thus, along
with the constraints imposed by the GRN, a hierarchy
of decay times of gene expression may define alternative
routes to cell fates [21,33]. This possibility has not been
studied systematically yet and it might prove crucial to
undertand how such GRN modules are connected to sig-
nal transduction pathways that alter cell-fate attainment
patterns.
Given the background exposed above, a first ques-
tion concerns the systematic exploration of the effect of a
hierarchy of gene expression times on cell-fate specifica-
tion during early flower development. On the other hand,
flower developmental mechanisms have been shown to
result largely from the global self-organizational proper-
ties of the FOS-GRN; yet, it has not been straightforward
to establish differences in the functional (dynamical) role
of individual genes within the network. Therefore, a sec-
ond question concerns whether by analyzing gene dynamics
we can test if there are such differences and, if so, if
they correlate with biological observables. Given that both
questions require modeling exercises that go beyond a
simple Boolean GRN model, in this contribution we first
propose a modeling framework to extend the Boolean
FOS-GRN model to a continuous system, and then show
how it can be used to explore the questions addressed
here.
For the sake of concreteness, we frame the questions
in the context of the dynamics of early flower devel-
opment as follows: (1) We define the propensity of the
Boolean stationary gene configuration to be transformed
by changes of particular gene parameters as a proxy for
gene functional role. (2) We test as a control parame-
ter the genes characteristic decay rate in order to further
explore the hypothesis raised in [33], that differences in
gene decay rates may potentially guide cell-fate decisions
during flower development. (3) We contrast the dynam-
ical/biological classification with the known experimen-
tal data regarding the role of the ABC genes. In other
words, we functionally classify the genes in the network by
exploring their propensity to produce qualitative changes
in the AL that would ultimately lead to cell-fate decisions
(i.e., attractor transitions). We also analyze the robustness
of each attractor by means of their propensity (or lack
thereof) to undergo such induced transitions. We hypoth-
esize that there is a relationship between the impact of
specific genes in the dynamics of the whole GRN, their
biological function, and the observed hierarchy of differ-
entiation events during early flower development.
Overall, this work constitutes a first step towards
the dynamical, mechanistic characterization of the main
molecular regulators of flower development; and provides
a general methodological framework to approach simi-
lar questions in other developmental processes. It also
provides hypotheses concerning which genes within the
FOS-GRN are more likely to link this module to other
regulatory circuits and signaling transduction pathways
which might be crucial for the temporal progression of
flower development. In conclusion, the approach put for-
ward here allows analyses of the role of the genes’ decay
rates in modifying the AL and thus affecting cell-fate
transitions or patterning.
Methods
Modeling framework
The scope of biological questions that Boolean GRNmod-
els are suited to address can be expanded. Here we focus
on two specific questions that are important for develop-
mental biology andwhich cannot be addressed by Boolean
models – as originally proposed. (1) Although gene knock-
out or over-expression experiments are straightforward
to simulate using a Boolean model, the richness of gene
interactions may be more thoroughly explored by consid-
ering the intertwined dynamics of differentiation stimuli
(microambient alterations, chemical signaling, catalytic
reactions, etc.) and gene characteristic expression times
which determine the developmental process itself, and
which are not easily taken into account in a Boolean
approach due to the absence of genes’ specific param-
eters. (2) It is not straightforward to study potential
transition events among the already characterized stable
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 4 of 14
cellular phenotypes with the Boolean deterministic for-
malism. With this limitations in mind, here we propose
a novel modeling framework as an extension of the orig-
inal Boolean GRN model. Our goal was to devise an
extended methodology able to circumvent these limita-
tions while maintaining the simplicity and clarity of the
Boolean model. The proposed framework includes the
following steps (see Figure 1): (1) the characterization of
the dynamical behavior of an experimentally grounded
Boolean GRN - and its associated AL, (2) the transfor-
mation of the Boolean model into a system of ordinary
differential equations (ODEs) with an equivalent AL, (3)
an attractor-wise, gene-wise numerical bifurcation anal-
ysis using the characteristic decay rate of each gene as
a control parameter [43,44], and (4) the classification of
genes into groups according to their propensity to induce
qualitative changes to the AL and their potential to cause
specific transitions between attractors.
Boolean GRNmodel
A Boolean network is a dynamical model with discrete
time and discrete state variables. This can be expressed
formally as:
xi(t + 1) = Fi(x1(t), x2(t), . . . , xk(t)), (1)
where the set of functions Fi are logical prepositions (or
truth tables) expressing the relationship between a gene
i and its k regulators, and where the state variables xi(t)
can take the discrete values 1 or 0 indicating whether
the gene i is expressed or not at a certain time t, respec-
tively. An experimentally grounded Boolean GRN model
is completely specified by the set of genes proposed to be
involved in the process of interest and the associated set
of logical functions derived from experimental data [21].
The set of logical functions for the FOS-GRN used in
this study is included in Additional file 1. The dynami-
cal analysis of the Boolean network model was conducted
using the package BoolNet [45] within the R statistical
programming environment (www.R-project.org).
Continuous GRNmodel
In order to characterize qualitative changes in the dynam-
ics of the GRN under continuous variations of a given
parameter (here a gene’s decay rate) we study a continuous
representation of the discrete Boolean dynamics. Several
approaches have been used to describe a Boolean GRN as
a continuous system [21,33,46,47]. Here we adopt a system
of ODEs of the form:
dxi
dt
= [ fi(x1, x2, . . . , xk)]−kixi, (2)
where ki represents the expression decay rate of the gene
i of the GRN. The function fi results from performing a
transformation to the corresponding boolean function Fi
following the rules:
Figure 1 Schematic representations of the modeling methodology. a) The starting point is an experimentally grounded and dynamically
characterized GRN Boolean model. Here the FOS-GRN is used, which recovers ten fixed-point attractors representing the cell-types observed during
early flower development. b) The Boolean model is transformed into an equivalent continuous dynamical model. A set of rules is applied to the
logical propositions of the Boolean model in order to derive a logic-based ODE model in continuous state-space. c) An attractor-wise, gene-wise
numerical bifurcation analysis is performed. Because of qualitative changes to the AL induced by increasing parameter values several basins of
attraction may merge into one, causing an inevitable cell-fate decision (i.e., an attractor transition).
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 5 of 14
xi(t) ∧ xj(t) → xi(t) . xj(t),
xi(t) ∨ xj(t) → xi(t) + xj(t) − xi(t) . xj(t),
¬xi(t) → 1 − xi(t).
(3)
Following [21,33] we consider that the input-response
function associated to each gene displays a saturation
behavior characterized by a logistic function. In this case,
the input associated with the gene i takes the form:
[ fi(x1, x2, . . . , xk)]=
1
1 + exp[−b[ fi(x1, x2, . . . , xk) − ǫ] ]
,
(4)
where ǫ is a threshold level (usually ǫ = 1/2), and b the
input saturation rate. For b >> 1, the input function dis-
plays dichotomic behavior. A stationary state is defined by
dxi/dt = 0, so that Eq.(2) yields
xsi =
1
ki

[
fi
(
xs1, x
s
2, . . . , x
s
k
)]
, (5)
where xsi denotes the stationary value.We observe that the
expression level of the GRN node i is inversely propor-
tional to its decay rate, so that for a fast decay rate ki ≫ 1
the expression level xsi → 0, while for a slow decay ki ≪ 1,
xsi ≫ 1. Thus, a hierarchy in gene decay rates determines
a pattern of relative gene expression levels.
The obtained system of ODEs is included on
Additional file 1. Similar logic-based ODE models have
been presented before (see, for example [48,49]). The
numerical analysis of the system of ODEs was conducted
using inhouse R code exploiting the functions provided in
the packages deSolve [50] and rootSolve [50], as described
in [51]. During preliminary simulation experiments we
observed that under the specified parameter values the
uncovered fixed-point attractors always showed extreme
values – i.e., close to either 0 or 1, but not to 0.5.
Attractors landscape operational definition
The Attractors Landscape (AL) is specified by the exhaus-
tive characterization of the state-space. We operationally
define the AL as the data structure containing two ele-
ments: (1) a 2n×n state-spacematrix, a matrix whose rows
correspond to each of the 2n possible states of a Boolean
GRN; and (2) a vector of length 2n whose elements take
values Ai from the set {1, . . . ,An} where An is the number
of attractors of a given Boolean Network. This structure
thusmaps each state to its corresponding attractor. For the
case of the ODEsmodel, the obtained attractor states were
discretized in order to have a direct comparison with the
Boolean model. Following [52] an unsupervised k-means
clustering algorithm [53] with two clusters (i.e., k = 2)
corresponding to the two binary values was used for the
discretization task (for details see [52]).
Bifurcation analysis
All bifurcation analyses were conducted numerically using
the following algorithm. A specific attractor is taken as
an initial condition in an ODEs initial-value problem. For
each active gene in the attractor state: (1) an ordered
set of values for the control parameter (here the gene’s
decay rate ki) is chosen – while the rest of the parameters
are kept constant; (2) the ODEs are solved numerically
until reaching an steady state, each time using a different
parameter value, and for all the parameter values in the
set; and (3) a plot is generated with parameter values in
the x-axis and the total sum y of the single gene expression
values for the n genes (i.e., y =
∑n
i=1 x
∗
i ) of the obtained
steady state x∗
i in the y-axis. The analysis is performed for
each attractor. Qualitative changes are identified by the
occurrence of sudden jumps in the bifurcation graphs.
Data analysis
Network topology
For each gene (node) in the FOS-GRN the following mea-
sures of topological importance were calculated: degree
(number of nodes it is connected to), in-degree (number
of connections directed towards it), out-degree (num-
ber of connections directed towards other nodes), and
betweenness (fraction of all shortest paths that pass
through it). All network topological computations were
conducted using the igraph package [54]. In order to
test for the association of the genes propensity to pro-
duce AL qualitative changes and their topological features
within the network, simple linear regression models were
fitted using the calculated propensity of each gene to pro-
duce a qualitative change as response variable and each
topological feature as predictor.
To test whether interacting genes in the FOS-GRN have
a related propensity to produce AL alterations in response
to an increase in their decay rate. The average absolute
difference of the value of the calculated gene sensitiv-
ity between interacting components in the network was
calculated and then used as a statistic in a simulation
(sampling) procedure in order to assess how frequently it
is expected to observe this or a smaller value in an ensem-
ble of similar but random networks. Specifically, 100,000
networks each with the same number of nodes and inter-
actions were generated, and the statistic was calculated
for each of these networks. The estimated distribution of
the statistic over the ensemble of networks was then used
to calculate the probability of observing a value equal or
smaller than that calculated in the FOS-GRN.
Results
Dynamical analysis of the GRN
The GRN underlying early flower development (refered to
as FOS-GRN) was used as a study case. The most recent
version reported in [33] was used. The corresponding
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 6 of 14
logical update rules are reported in Additional file 1. The
first task was to characterize the GRN dynamical behavior
and its associated AL. The global dynamical behavior of
the network was analyzed by the exhaustive characteriza-
tion of all steady states using all possible initial conditions.
Specifically, we calculated its attractor states and their
corresponding basins of attraction. We arranged both ini-
tial conditions and corresponding attractor into an AL
structure (see methods). As expected, the network recov-
ered 10 fixed-point attractors: four corresponding to the
four regions of the inflorescence meristem (Inf1, Inf2,
Inf3, and Inf4), and six to the four floral organ primor-
dial cells within the flower primordia (Sep, Pt1, Pt2, St1,
St2, and Car). The two attractors corresponding to petals
(Pt1 and Pt2) are identical except for the state of activation
of the UFO gene, and the same holds for the two stamen
attractors (St1 and St2). The attractors and its basins are
reported in Additional file 1. We then transformed the
Boolean network into a system of ODEs (see Methods).
A series of studies have extensively validated the
Boolean FOS-GRN model in terms of increasingly avail-
able experimental data; for example, it has been shown
that its dynamical behavior is robust enough as to pre-
dict the experimentally induced phenotypes in several
mutant conditions [11,19,24,55]. In order to preserve such
validated behavior we derived a ODEs model preserving
the attractors and basins of attraction uncovered in the
Boolean case. The input-response function included in the
proposed continuous model contains 2 parameters: b, and
ǫ. The value of the parameter b was chosen as the smallest
integer value able to recover the same fixed-point attrac-
tors and their basins of the Boolean model. We tested a
range of values b = i for [ 1, ..i.., 40]. We found that a value
of b ≥ 5 is able to recover the same attractors and basin
sizes that the ones uncovered with the Boolean model. We
use a value of b = 5 for all the following calculations. The
ǫ parameter is a threshold level, for simplicity a value of
ǫ = 0.5 was used. For this first analysis the decay param-
eter for each gene was set to ki = 1. The 10 attractors
obtained with these settings, and its basins size are shown
in Additional file 1. Thus, we derived two dynamical mod-
els for the FOS-GRNwith an equivalent behavior in terms
of the uncovered attractors and basins of attraction. We
specified an AL structure for each model.
Bifurcation analysis
We performed a numerical analysis in order to explore
the propensity of single genes to qualitatively change the
attractor states where they are expressed (and thus induce
attractor transitions in the AL) in response to an increase
in their decay rate parameter (see Methods). To illustrate
our analyses, we generated a set of graphs, one per each
gene expressed in each attractor. In the graph we plot-
ted the initial attractor state and its progressive change
resulting from altering the decay parameter ki. If m genes
were active in the attractor in question, the analysis was
conducted for each gene i for i =[ 1, . . . ,m]. We per-
formed the analysis to each attractor j for j =[ 1, . . . , 10].
Figure 2 shows the graphs obtained for the genes corre-
sponding to carpels (Car) attractor. In this case, only the
genes AG and LFY were able to induce an phase tran-
sition. Whereas gene AG produces a transition between
already characterized attractor states (Car → Sep), the
change in LFY produces a new attractor state. The graphs
for all the attractors (and their genes) are reported in
Additional file 2. We found that for each attractor at least
one of its expressed genes is able to produce a qualita-
tive change to the AL. Some genes (attractors) are more
likely to produce (undergo) attractor transitions. These
results suggest that, by systematically testing the potential
of altering specific genes qualitatively changes the GRN
underlying AL, we can uncover differences in the genes
functional (dynamical) role in the overall system under
analysis.
Gene classes
In order to have a better understanding of the nature of
the uncovered differential functional (dynamical) role of
genes, we classified the genes according to their propen-
sity to induce attractor transitions. Table A1, in Additional
file 1 summarizes the result of all the bifurcation analyses.
For each attractor, and for each perturbed gene, we regis-
tered whether a qualitative change is produced or not, and
the final attractor attained after the simulated change. In
order to numerically express the propensity of each gene
to induce qualitative changes, we counted the number of
times a gene is able to produce a qualitative change and
normalized this number by the number of times the gene
is expressed among the 10 attractors. The resulting scale is
shown in Figure 3. We will refer to this quantified propen-
sity to induce qualitative changes (phase transitions) as the
metric PT. In order to classify a gene with either high or
low propensity, we clustered the genes described by the
quantified propensity PT in two groups using the k-means
clustering algorithm [56]. According to this analysis, the
genes with higher propensity are: UFO, AP1, WUS, AG,
TFL1, EMF1, and LFY (see Figure 3). On the other hand,
genes were also classified depending on whether or not,
when they induce a qualitative change, are able to induce a
transition between already characterized attractor states.
The genes found to be able to produce this type of tran-
sitions are: UFO, AP1, WUS, AP3, AG, TFL1, EMF1, and
PI. Additionally, we also classified the genes depending
on whether or not they are able to produce new attractor
states after the qualitative change. The genes that show
this behavior are: SEP, AP2, PI, LFY. The three classes are
shown in Table 1. In Figure 4 we map to each node in the
graph of the GRN its corresponding metric PT.
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 7 of 14
Figure 2 Bifurcation diagrams. Graphs obtained as a result of the Bifurcation analysis performed for the genes corresponding to carpels (Car)
attractor. The genes AG and LFY were able to induce qualitative changes. The gene AG produced a transition between already characterized
attractor states (Car → Sep). The change in LFY produced a new attractor state.
Analysis of the classes of genes
In order to test if there is evidence of an association
between the differential functional role of genes and
background biological knowledge, we compared the rep-
resentation of the ABC genes and the additional (non-
ABC) genes of the FOS-GRN within each of the classes
described in the previous subsection, and listed these in
Table 1. We followed two procedures: (1) calculated the
gene frequency of each biological group (e.g. ABC, or
Additional) within each gene class, (2) perform a hyper-
geometric test for biological group over-representation.
Figure 5a shows the results. We found the following pat-
terns. In the classes defined by the gene propensity to
induce qualitative changes, there is a lower (higher) rep-
resentation of ABC genes in the high (low) propensity
class with respect to the other additional genes. On the
other hand, in the classes defined by the gene capacity to
produce attractor transitions between known or unknown
attractors, there is a higher representation of ABC genes
with respect to the other additional genes in both classes.
These results suggest that ABC genes are less likely to pro-
duce qualitative changes in the AL by induced changes
in their expression dynamics - at least under a relatively
higher decay rate as tested here - than the non-ABC genes
in the network. On the other hand, if such a qualitative
change occurs, ABC genes are more likely to both induce
inter-attractor transitions and to specify novel attractors
than the non-ABC genes in the network. These seem-
ingly contradictory results can be understood by taking
into consideration the relative robustness of the differ-
ent attractors against such parameter perturbations (see
below).
Attractors propensity to undergo transitions
Taking in consideration that not all the genes are
expressed in all the attractors, we also compared the
propensity of the different attractors to undergo attractor
transitions by calculating the frequency of attractor tran-
sitions per attractor as the number of undergone attractor
transitions normalized by the number of genes expressed
in the respective attractor. The results are shown in
Figure 5b. For this analysis we mapped all the states in the
AL corresponding to any of the four inflorescence attrac-
tors (Inf1, Inf2, Inf3, and Inf4) into a single Inf attractor.
We also mapped the states of the attractors (St1, St2) and
(Pt1, Pt2) to the individual attractors St and Pt, respec-
tively. Hence, the system had a total of five attractors.
We found that the inflorescence attractor is the attractor
with the highest propensity. Specifically, a relatively higher
decay rate of any of the genes expressed in the inflores-
cence attractors (TFL1, EMF1, UFO, WUS) with respect
to the other genes always produces an attractor transition.
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 8 of 14
Figure 3 PT metric values. The plot shows propensity to induce phase transitions quantified for each gene. The horizontal line divides the genes
into groups of higher (above) of lower (below) propensity. The two groups are based on a clustering analysis.
Three of the flower attractors (Car, Sep, St) show a fre-
quency of attractor transitions lower than 0.5, while the
remaining flower attractor (Pe) shows a frequency ∼ 0.6.
These results suggest a relationship between the propen-
sity of undergoing attractor transitions and the direction
of differentiation during early flower development - being
less likely to induce attractor transitions as the course
of development progresses, or to produce a reprogram-
ming from a floral organ attractor to an inflorescence one.
Interestingly, attractors propensity to undergo attractor
transitions do not correlate with the attractors basin sizes
(see Figure 5b), as intuitively expected.
Table 1 Gene classes according to their propensity to
produce qualitative changes to the attractors
Classification Genes
High propensity genes UFO, AP1, WUS, AG, TFL1, EMF1, LFY
Low propensity genes SEP, FT, AP3, AP2, PI, FUL
Genes causing transition UFO, AP1, WUS, AP3, AG, TFL1, EMF1, PI
between known attractors
Genes causing transition SEP, AP2, PI, LFY
between unknown attractors
Genes propensity to produce qualitative changes and
network structure
Given that it is common to provide evidence of the gene
importance in the context of networks by considering
only each gene’s topological features [57], we tested if the
gene’s propensity to produce qualitative changes to the
AL as defined here is correlated with topological proper-
ties. Specifically, we tested an association between each
of genes topological features and the quantified gene’s
propensity of producing a qualitative change to the AL
(PT metric) by performing linear regression analyses. We
characterized each node by a set of network topologi-
cal features, which express numerically the placement of
each gene within the network. For each gene (node) in
the FOS-GRN we calculated two commonly used mea-
sures of topological importance: degree (number of nodes
it is connected to), and betweenness (fraction of all short-
est paths that pass through it). We also considered that
the dynamical behavior of the GRN is associated with the
type of interactions within the network, thus we specified
further the degree feature into in-degree or out-degree.
Interestingly, we found a significant relationship between
PT metric and two predictor variables: out-degree and
betweenness (p-value = 0.03). In Figure 4 we represent
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 9 of 14
a b
Figure 4 The FOS Gene regulatory network. The graph represents the mapping of the calculated PT values with the topological features
out-degree (a) and betweenness (b) into the graph of the FOS-GRN. The size of the nodes represents the PT values. The topological features are
represented by a graded yellow-red color scale with yellow (red) in the left (right) extreme.
graphically the associations by mapping the PT values
and the topological features out-degree (Figure 4a) and
betweenness (Figure 4b) into the graph of the FOS-GRN.
The size of the nodes represents the PT values in the
scale [ 0, . . . , 1]. The topological feature is represented by
a graded yellow-red color scale with yellow (red) in the left
(right) extreme [ 0, . . . , 1].
Similarity in the propensity of interacting genes
To further test if there is an association between gene’s
topological features and their propensity to produce qual-
itative changes in the attractors, we performed the fol-
lowing analysis. Given the PT values for each gene, we
asked if interacting genes within the FOS-GRN share
more similar propensity within themselves than with non-
interacting components. This pattern, if found, would
suggest a close relationship between network architecture
and such gene’s dynamical property. Similar analyses have
been proposed in network-based molecular evolutionary
studies as a test for an association between network struc-
ture and evolutionary constraint [58,59]. In order to test
whether this pattern is present in the FOS-GRN we cal-
culated the average absolute difference (AAD) of the PT
value between interacting components in the networks
and used it as an statistic. An AAD of PT of 0.333 was
calculated for the FOS-GRN. We then tested how likely is
this value to be explained by change alone; specifically, we
generated a null distribution by calculating AAD values in
an ensemble of similar but random networks. We include
the histogram of the corresponding statistic on an ensem-
ble of 100,000 random networks with the same number
of nodes and interactions in Additional file 1. Based on
this data we estimated the probability of observing such
a small value by calculating the fraction of random net-
works showing an AAD value AAD ≤ 0.333 or greater.
The resulting probability was 0.06.
Taken together these results: (1) a significant relation-
ship between PT metric and the topological features of
out-degree and betweeness, and (2) a marginally signifi-
cant (p-value∼ 0.06) similar propensity within interacting
genes; support the hypothesis that there is a close rela-
tionship between a gene’s placement in the network, or its
micro-topological position within a GRN, and its propen-
sity to produce qualitative changes to the AL – at least
in the case of the FOS-GRN. More general analyses for
GRNwith different topologies and architectures should be
done.
Discussion
Recently, several authors have considered the restruc-
turing of the state-space associated with a dynamical
model of a GRN as an important aid for understanding
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 10 of 14
a b
c
Figure 5 Genes propensity and functional class. a) The plot shows the gene frequency of each functional group (i.e., ABC, or Additional) within
each gene class (i.e., High Propensity Genes, Low Propensity Genes, Genes Causing Transition Between Known Attractors, Genes Causing Transition
Between Unknown Attractors). The star sign represents gene group over-representation as defined by a lower p-value relative to the other gene
functional class calculated with a hypergeometric test (see Methods). b) The plot shows the calculated attractors propensity to undergo attractor
transitions. c) Attractors basin size plot.
underlying mechanisms occurring during development
an evolution [5,32,60-65]. A conclusion is emerging: the
model of a landscape changing over time seems plausible
as an explanation for fundamental features of morpho-
genesis and tissue formation [13]. In general, however,
most work in this regard has been centered around either
conceptual discussions or the dynamical analyses of small
gene circuits. The exploration of such questions in larger,
multi-attractor GRNs, that are grounded on experimen-
tal data and underlie realistic cases of cell differentiation,
and in which the state-space presents a more complex
structure, has largely been left behind. Here we present
a modeling framework of general applicability as a first
step for such type of exploration. For the sake of concrete-
ness, we used as a model GRN the specific case of the
FOS-GRN.
ODE-based models allows more flexible choice of net-
work parameters reflecting, for example, different inter-
action strengths or inductive signals. Analyses of math-
ematical models of differentiation dynamics have shown
that the considerations of such flexibility may be impor-
tant to understand and control cell-fate choices (see, for
example [5,9]). In the present case, given the hypothesis
raised by some of the authors in [33] that differences in
gene decay rates may potentially guide cell-fate decisions
during flower development; we focus exclusively on the
impact of relative gene decay rates in restructuring the
AL, and thus we limit the scope of our conclusions. Addi-
tionally, the specific biological mechanisms driving such
differential expression dynamics in vivo are not known.
We speculate that signaling modules regulating responses
to environmental cues may be directly connected to some
of the components included in the GRN module analyzed
here. In this direction, some of the authors have recently
started to characterized such integrated GRNs consider-
ing the relevance of light sensing in flowering develop-
mental choices [66]. Future work will test the effect of
coupling such signaling modules with the GRN analyzed
herein on the structure of the AL.
In the present case, when a given gene’s decay rate
is tuned and crosses a threshold, we observe qualitative
changes in the AL’s organization. We refer to the different
patterns of organization as phases. The study of complex
systems is, to a large extent, a search for the principles per-
vading self-organized, emergent phenomena and defining
its potential phases [43,67]. Following this complex sys-
tems perspective, in this work we thus explored the phase
changes in the AL that emerge from the dynamics of an
experimentally grounded, complex GRN. Such transition
phenomena are collective by nature and result from inter-
actions taking place among the interacting genes of the
GRN and not by any single gene alone. In any case, our
exploration helped uncover a differential role of individual
genes regarding their propensity to produce these induced
phase transitions.
Given that the observed phase changes effectively cor-
respond to qualitative changes of the AL in which one
or more of the attractors (cell states) disappear, the result
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 11 of 14
would inevitably lead to an induced cell-fate decision.
We focus on these latter attractors transitions. We must
point out that in the present case we study the induced
qualitative changes of the AL indirectly by systematically
analyzing the local effects on each attractor of quantitative
changes in gene decay rates. The relative stability of each
attractor’s basin is expected to be relevant in constrain-
ing transitions among attractors. This latter problem is the
subject of current intense research and is more naturally
approached by using stochastic models (see, for example
[34,68]).
Differences in decay rates may also be interpreted as
different time-scale regimes. Interestingly, a recent study
stressed the relevance of time delays arising from multi-
step chemical reactions or cellular shape transformations
[69]. Specifically, the authors argue in this reference that
such feature is crucial in understanding cell differentia-
tion, as it leads to novel states in epigenetic landscapes.
In the present case, we indeed found that relatively differ-
ent gene time-scale regimes produce qualitative changes
to the otherwise static AL. Unlike the generic model pre-
sented by Mitra and collaborators [69], however, here we
studied the dynamical behavior of specific genes which
have been extensively characterized experimentally dur-
ing decades of plant developmental genetics studies (see,
for example [2]).
Most studies on the molecular basis of floral develop-
ment focus on the eukaryotic MADS-box gene family,
particularly floral homeotic genes such as AGAMOUS
(AG), APETALA3 (AP3), PISTILLATA (PI), and several
AGAMOUS-like genes [70]. Such genes are also the most
important constituents of the ABCmodel for flower organ
specification described above. Although based on exten-
sive experimentation, the ABC genes have been charac-
terized as having a prominent, functional role in cell fate
and organ type specification during early flower devel-
opment yielding homeotic transformations among floral
organ when mutated; it was only a mechanistic view, the
FOS-GRN dynamical model, which provided a sufficient
explanation for the empirically observed ABC patterns
– i.e., the combinatorial ABC code and the stable gene
expression configurations observed during early flower
development in Arabidopsis [2,11,19]. This model has
been studied from different perspectives [24,33,41].
When testing the coherence of experimental data
regarding the role of these molecular regulators under
the framework of a GRN dynamical model certain ques-
tions arise. Why the ABC genes and not the other genes
in the network display homeotic mutations when they are
inactivated? Is there a relationship with this characterized
biological (functional) property and its dynamical behav-
ior within the FOS-GRN? What genes are more prone to
have a stronger influence on the dynamical behavior of the
whole system, and thus the phenotype, when perturbed
or coupled with other circuits, signaling mechanisms, or
processes outside the GRN module? Here we present a
methodological framework for systematically testing the
potential of specific genes when perturbed to produce
qualitative changes to the underlying AL. By applying
this approach to the FOS-GRN we uncover differences in
the functional (dynamical) role of their genes. We specu-
late that such dynamical behavior might give information
about which genes are most likely to be links with other
circuits and processes.
A somewhat unexpected result is that the homeotic
genes are less likely to produce attractor transitions in
the AL by an induced higher decay rate, in comparison
to other non-ABC genes in the network (see Methods).
However, if we consider that ABC genes specify floral
organ identity, a late process in early flower development,
a higher robustness to non genetic perturbations such as
changes in gene expression parameters is consistent with
an increased stability of the cellular phenotypes as devel-
opment proceeds. Indeed, when analyzing the propensity
of the different attractors to undergo attractor transitions
(see Methods) we found that the attractors corresponding
to the flower cell-types show a lower propensity that the
Inflorescence attractors (see below). On the other hand,
we also found that in the cases where a phase transition
induced by higher decay rates of ABC genes relative to the
rates of other genes, the output is more likely to produce
both an induced inter-attractor transition and to specify a
novel attractor. This result aligns well with the empirical
status of the ABC genes as homeotic genes, as it suggests
that higher enough perturbations slowing gene function
that approach a loss-of-function mutation, eliminate or
produce specific cellular phenotypes, that correspond to
changes of attractors, and thus homeotic alterations.
In Alvarez-Buylla and collaborators [24] some of the
authors proposed a mechanistic explanation for the
stereotypical temporal pattern of cell-fate specification
during early flower development by means of noise-
induced attractors transitions. In that study, however, it
was shown that stochasticity alone was not able to explain
a transition from the inflorescence to the flower meris-
tems (attractors), an early, well-characterized event during
flower development. Thus the authors speculate on the
role of non-random inductive signals in the transition
from cell fates in the inflorescence meristem to those in
the flower meristem [24]. Our results suggest that this
indeed could be the case, as a relatively higher decay
rate of any of the genes expressed in the inflorescence
attractors (TFL1, EMF1, UFO,WUS), with respect to the
other genes, always produces a phase transition, and this
transitions predominantly lead to flower organ attrac-
tors (see results). Thus, our model uncovered a potential
mechanism which could be subjected to experimental val-
idation. Namely, TFL1, EMF1, UFO, orWUS genes have a
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 12 of 14
relatively higher gene decay rate relative to flower specifi-
cation genes during early flower development and within
the inflorescence meristem. This feature in turn facilitates
the inflorescence-flower transition when these genes are
altered in their decay rates, thus suggesting that signals or
pathways at play during the transition from inflorescence
to flower meristem should interact or affect decay rates of
these genes. In contrast, most functional studies concern-
ing inflorescence to flower transition, have mostly focused
on LFY and also on AP1 [71,72].
The distinction between molecular network structure
and function is a core problem in systems biology. Dynam-
ical GRN models enable a rigorous distinction between
structure (topology) and function (dynamics). In a recent
molecular evolutionary study also using the FOS-GRN, it
was suggested that the dynamical functional role of genes
within the network, and not just its connectivity, could
play an important role in constraining evolution [59]. Such
hypothesis implies a close relationship between network
structure and function. Based on our operational defini-
tion of the gene functional role as the gene’s propensity to
produce AL attractor transitions, we asked if this property
is associated with the gene’s network topological fea-
tures. We found that a significant correlation among these
two. Our results thus support the hypothesis that for the
FOS-GRN there is a close relationship between a gene’s
placement in the network and its propensity to produce
attractor transitions in the AL. Likewise our results also
provide partial support for the dynamical functional role
of genes being important for constraining evolutionary
changes.
Conclusions
In this contribution we present a methodology of gen-
eral applicability as a first step for exploring the restruc-
turing of the state-space associated with a dynamical
multi-attractor GRN model. The framework consists on
systematically exploring the propensity of single genes
to produce qualitative changes in the AL as a result of
changes in their parameters. Importantly, different GRN
models and the effect of general inductive signals can
be explored within the same framework. We showed
how biological insights can be derived by applying the
methodological framework to a single well-characterized
and experimentally groundedGRN: the FOS-GRN. Future
studies should explore if the results derived for this GRN
can be generalized to GRN with contrasting typologies
and architectures.
We systematically explored the effect of relative differ-
ences in gene decay rates on AL structure, and showed
that by analyzing gene dynamics we can test if there
are differences in the functional (dynamical) role among
individual genes within the network, and that such dif-
ferences correlate with biological observables. Specifically,
(1) the dynamical behavior of ABC genes provide both
robustness and flexibility in response to parameter per-
turbations, and are prone to both produce inter-attractor
transitions and specify novel attractors; (2) It is less likely
to induce attractor transitions as the course of develop-
ment progresses; (3) non-random inductive signals may
be at play in the transition from cell fates in the inflores-
cence meristem to those in the flower meristem; and (4)
for the FOS-GRN there is a close relationship between a
gene’s placement in the network and its dynamical role.
Taking together, our results suggest that there is a relation-
ship between the impact of specific genes in the dynamics
of the whole FOS-GRN, their biological function, and the
observed hierarchy of differentiation events during early
flower development.
Additional files
Additional file 1: Supporting Figures and Tables. The Additional file 1
includes the following information: Figure A1. Boolean FOS-GRN logical
update rules. Figure A2. Attractors of the Wild-type Boolean FOS-GRN.
Figure A3. ODEs model of the FOS-GRN. Figure A4. Attractors of the
Wild-type ODEs FOS-GRN Model. Figure A5. Comparison of the Attractors
and Basins Uncovered with the Boolean and ODEs FOS-GRN Models. Table
A1. Table summarizing the result of all the bifurcation analyses. Figure A6.
Histogram of the average absolute difference in PT values calculated from
simulated networks values.
Additional file 2: Results of the Bifurcation (Phase transition)
Analysis.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
ERAB coordinated the study and with the other authors established the overall
logic and core questions to be addressed. All three authors conceived and
planned the modeling approaches, JDV established many of the specific
analyses to be done, recovered the information from the literature to establish
the model, programmed and ran all the modeling and analyses. All authors
participated in the interpretation of the results and analyses. JDV wrote most
of the paper with help from ERAB and inputs from CV. All authors proofread
the final version of the ms submitted. All authors read and approved the final
manuscript.
Acknowledgements
Jose Davila-Velderrain acknowledges support from the graduate program
Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de
México and CONACyT for financial support. This work is presented in partial
fulfillment towards his doctoral degree in this Program. This work was
supported by grants CONACyT:180380 (CV and ERAB), 180098 (ERAB);
UNAM-DGAPA-PAPIIT: IN203113; IN204011; IN226510-3 and IN203814 to ERAB.
We acknowledge Diana Romo for her help in many logistical tasks.
Author details
1 Instituto de Ecología, Universidad Nacional Autónoma de México, Cd.
Universitaria, 04510 México, D.F., México. 2Centro de Ciencias de la
Complejidad (C3), Universidad Nacional Autónoma de México, Cd.
Universitaria, 04510 México, D.F., México. 3 Instituto de Física, Universidad
Nacional Autónoma de México, Cd. Universitaria, 04510 México, D.F., México.
Received: 19 December 2014 Accepted: 22 April 2015
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 13 of 14
References
1. Alvarez-Buylla aER, Benítez M, Dávila EB, Chaos A, Espinosa-Soto C,
Padilla-Longoria P. Gene regulatory network models for plant
development. Curr Opin Plant Biol. 2007;10(1):83–91.
2. Alvarez-Buylla ER, Azpeitia E, Barrio R, Benítez M, Padilla-Longoria P.
From abc genes to regulatory networks, epigenetic landscapes and
flower morphogenesis: making biological sense of theoretical
approaches. Seminars Cell Dev Biol. 2010;21(1):108–17.
3. Furusawa C, Kaneko K. A dynamical-systems view of stem cell biology.
Science. 2012;338(6104):215–7.
4. Huang S, Kauffman S. Complex gene regulatory networks-from structure
to biological observables: cell fate determination In: RA M, editor.
Encyclopedia of Complexity and Systems Science. New York: Springer;
2009. p. 1180–293.
5. Huang S, Guo Y-P, May G, Enver T. Bifurcation dynamics in
lineage-commitment in bipotent progenitor cells. Dev Biol. 2007;305(2):
695–713.
6. Huang S. Cell lineage determination in state space: a systems view brings
flexibility to dogmatic canonical rules. PLoS Biol. 2010;8(5):1000380.
7. Andrecut M, Halley JD, Winkler DA, Huang S. A general model for binary
cell fate decision gene circuits with degeneracy: indeterminacy and
switch behavior in the absence of cooperativity. PloS One. 2011;6(5):
19358.
8. Zhou JX, Brusch L, Huang S. Predicting pancreas cell fate decisions and
reprogramming with a hierarchical multi-attractor model. PloS One.
2011;6(3):14752.
9. Li C, Wang J. Quantifying cell fate decisions for differentiation and
reprogramming of a human stem cell network: landscape and biological
paths. PLoS Comput Biol. 2013;9(8):1003165.
10. Jaeger J, Sharpe J. On the concept of mechanism in development. In:
Towards a Theory of Development. Oxford: Oxford University Press; 2014.
p. 56.
11. Espinosa-Soto C, Padilla-Longoria P, Alvarez-Buylla ER. A gene regulatory
network model for cell-fate determination during arabidopsis thaliana
flower development that is robust and recovers experimental gene
expression profiles. Plant Cell Online. 2004;16(11):2923–39.
12. Von Dassow G, Meir E, Munro EM, Odell GM. The segment polarity
network is a robust developmental module. Nature. 2000;406(6792):
188–92.
13. Kaneko K. Characterization of stem cells and cancer cells on the basis of
gene expression profile stability, plasticity, and robustness. Bioessays.
2011;33(6):403–13.
14. Kauffman SA. Metabolic stability and epigenesis in randomly constructed
genetic nets. J Theor Biol. 1969;22(3):437–67.
15. Kauffman SA. The Origins of Order: Self-organization and Selection in
Evolution. New York: Oxford university press; 1993.
16. Huang S, Eichler G, Bar-Yam Y, Ingber DE. Cell fates as high-dimensional
attractor states of a complex gene regulatory network. Phys Rev Lett.
2005;94(12):128701.
17. Chang HH, Hemberg M, Barahona M, Ingber DE, Huang S.
Transcriptome-wide noise controls lineage choice in mammalian
progenitor cells. Nature. 2008;453(7194):544–7.
18. Albert R, Othmer HG. The topology of the regulatory interactions
predicts the expression pattern of the segment polarity genes in
drosophila melanogaster. J Theor Biol. 2003;223(1):1–18.
19. Mendoza L, Alvarez-Buylla ER. Dynamics of the genetic regulatory
network for arabidopsis thaliana flower morphogenesis. J Theor Biol.
1998;193(2):307–19.
20. Albert I, Thakar J, Li S, Zhang R, Albert R. Boolean network simulations
for life scientists. Source Code Biol Med. 2008;3(1):1–8.
21. Azpeitia E, Davila-Velderrain J, Villarreal C, Alvarez-Buylla ER. Gene
regulatory network models for floral organ determination. In: Flower
Development. New York: Springer; 2014. p. 441–69.
22. Waddington CH. The Strategy of Genes. London: George Allen & Unwin,
Ltd.; 1957.
23. Siegal ML, Bergman A. Waddington’s canalization revisited:
developmental stability and evolution. Proc Nat Acad Sci. 2002;99(16):
10528–32.
24. Álvarez-Buylla ER, Chaos Á, Aldana M, Benítez M, Cortes-Poza Y,
Espinosa-Soto C, et al. Floral morphogenesis: stochastic explorations of a
gene network epigenetic landscape. Plos One. 2008;3(11):3626.
25. Wang J, Zhang K, Xu L, Wang E. Quantifying the waddington landscape
and biological paths for development and differentiation. Proc Nat Acad
Sci. 2011;108(20):8257–62.
26. Enver T, Pera M, Peterson C, Andrews PW. Stem cell states, fates, and the
rules of attraction. Cell Stem Cell. 2009;4(5):387–97.
27. Fagan MB. Waddington redux: models and explanation in stem cell and
systems biology. Biol Philosophy. 2012;27(2):179–213.
28. Ladewig J, Koch P, Brüstle O. Leveling waddington: the emergence of
direct programming and the loss of cell fate hierarchies. Nat Rev Mol Cell
Biol. 2013;14(4):225–36.
29. Huang S. The molecular and mathematical basis of waddington’s
epigenetic landscape: A framework for post-darwinian biology?
Bioessays. 2012;34(2):149–57.
30. Davila-Velderrain J, Martinez-Garcia J, Alvarez-Buylla ER. Modeling the
epigenetic attractors landscape: towards a post-genomic mechanistic
understanding of development. Front Genet. 2015;6:160.
31. Zhou JX, Qiu X, d’Herouel AF, Huang S. Discrete gene network models
for understanding multicellularity and cell reprogramming: From network
structure to attractor landscapes landscape. In: Computational Systems
Biology. CA: Elsevier; 2014. p. 241–76.
32. Davila-Velderrain J, Alvarez-Buylla ER. Bridging genotype and phenotype.
In: Frontiers in Ecology, Evolution and Complexity. CopIt ArXives; 2014.
p. 144-154.
33. Villarreal C, Padilla-Longoria P, Alvarez-Buylla ER. General theory of
genotype to phenotype mapping: derivation of epigenetic landscapes
from n-node complex gene regulatory networks. Phys Rev Lett.
2012;109(11):118102.
34. Zhou JX, Aliyu M, Aurell E, Huang S. Quasi-potential landscape in
complex multi-stable systems. J R Soc Interface. 2012;9(77):3539–53.
35. Choi M, Shi J, Jung SH, Chen X, Cho K-H. Attractor landscape analysis
reveals feedback loops in the p53 network that control the cellular
response to dna damage. Sci Signaling. 2012;5(251):83.
36. Wang P, Song C, Zhang H, Wu Z, Tian X-J, Xing J. Epigenetic state
network approach for describing cell phenotypic transitions. Interface
Focus. 2014;4(3):20130068.
37. Lu M, Onuchic J, Ben-Jacob E. Construction of an effective landscape for
multistate genetic switches. Phys Rev Lett. 2014;113(7):078102.
38. Fujimoto K, Ishihara S, Kaneko K. Network evolution of body plans.
PLoS One. 2008;3(7):2772.
39. Suzuki N, Furusawa C, Kaneko K. Oscillatory protein expression dynamics
endows stem cells with robust differentiation potential. PLoS One.
2011;6(11):27232.
40. Cotterell J, Sharpe J. Mechanistic explanations for restricted evolutionary
paths that emerge from gene regulatory networks. PloS one. 2013;8(4):
61178.
41. Sanchez-Corrales Y-E, Alvarez-Buylla ER, Mendoza L. The arabidopsis
thaliana flower organ specification gene regulatory network determines a
robust differentiation process. J Theor Biol. 2010;264(3):971–83.
42. Coen ES, Meyerowitz EM. The war of the whorls: genetic interactions
controlling flower development. Nature. 1991;353(6339):31–7.
43. Sole R. Phase Transitions. New Jersey: Princeton U. Press; 2011.
44. Seydel R. Practical Bifurcation and Stability Analysis. New York: Springer;
2010.
45. Müssel C, Hopfensitz M, Kestler HA. Boolnet–an r package for generation,
reconstruction and analysis of boolean networks. Bioinformatics.
2010;26(10):1378–80.
46. Glass L. Classification of biological networks by their qualitative dynamics.
J Theor Biol. 1975;54(1):85–107.
47. Mendoza L, Xenarios I. A method for the generation of standardized
qualitative dynamical systems of regulatory networks. Theor Biol Med
Modell. 2006;3(1):13.
48. Mangan S, Alon U. Structure and function of the feed-forward loop
network motif. Proc Nat Acad Sci. 2003;100(21):11980–5.
49. Lu M, Jolly MK, Gomoto R, Huang B, Onuchic J, Ben-Jacob E. Tristability
in cancer-associated microrna-tf chimera toggle switch. J Phys Chem B.
2013;117(42):13164–74.
50. Soetaert K, Petzoldt T, Setzer RW. Solving differential equations in r:
Package desolve. J Stat Software. 2010;33(9):1–25.
51. Soetaert K, Cash J, Mazzia F. Solving Differential Equations in R. New York:
Springer; 2012.
Davila-Velderrain et al. BMC Systems Biology  (2015) 9:20 Page 14 of 14
52. Shmulevich I, Kauffman SA, Aldana M. Eukaryotic cells are dynamically
ordered or critical but not chaotic. Proc Nat Acad Sci USA. 2005;102(38):
13439–44.
53. James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical
Learning. New York: Springer; 2013.
54. Csardi G, Nepusz T. The igraph software package for complex network
research. InterJournal Complex Syst. 2006;1695(5):1-9.
55. Barrio R. Á, Hernandez-Machado A, Varea C, Romero-Arias JR,
Alvarez-Buylla E. Flower development as an interplay between dynamical
physical fields and genetic networks. PloS one. 2010;5(10):13523.
56. Everitt B, Hothorn T. An Introduction to Applied Multivariate Analysis
with R. New York: Springer; 2011.
57. Yu H, Huang J, Zhang W, Han J-DJ. Network analysis to interpret
complex phenotypes. In: Applied Statistics for Network Biology: Methods
in Systems Biology. Germany: Wiley Online Library; 2011. p. 1–12.
58. Alvarez-Ponce D, Aguadé M, Rozas J. Network-level molecular
evolutionary analysis of the insulin/tor signal transduction pathway
across 12 drosophila genomes. Genome Res. 2009;19(2):234–42.
59. Davila-Velderrain J, Servin-Marquez A, Alvarez-Buylla ER. Molecular
evolution constraints in the floral organ specification gene regulatory
network module across 18 angiosperm genomes. Mol Biol Evol.
2014;31(3):560–73.
60. Pujadas E, Feinberg AP. Regulated noise in the epigenetic landscape of
development and disease. Cell. 2012;148(6):1123–31.
61. Ferrell Jr JE. Bistability, bifurcations, and waddington’s epigenetic
landscape. Curr Biol. 2012;22(11):458–66.
62. Wang J, Xu L, Wang E, Huang S. The potential landscape of genetic
circuits imposes the arrow of time in stem cell differentiation. Biophys J.
2010;99(1):29–39.
63. Jaeger J, Crombach A. Life’s attractors. In: Evolutionary Systems Biology.
New York: Springer; 2012. p. 93–119.
64. Jaeger J, Irons D, Monk N. The inheritance of process: a dynamical
systems approach. J Environ Zool Part B: Mol Dev Evol. 2012;318(8):
591–612.
65. Verd B, Crombach A, Jaeger J. Classification of transient behaviours in a
time-dependent toggle switch model. BMC Syst Biol. 2014;8(1):43.
66. Pérez-Ruiz RV, García-Ponce B, Marsch-Martínez N, Ugartechea-Chirino
Y, Villajuana-Bonequi M, de Folter S, et al. XAANTAL2 (AGL14) is an
important component of the complex gene regulatory network that
underlies arabidopsis shoot apical meristem transitions. Mol Plant. 2015.
doi:10.1016/j.molp.2015.01.017.
67. Haken H. Synergetics. New York: Springer; 1977.
68. Ge H, Qian H. Landscapes of non-gradient dynamics without detailed
balance: Stable limit cycles and multiple attractors. Chaos:
Interdisciplinary J Nonlinear Sci. 2012;22(2):023140.
69. Mitra MK, Taylor PR, Hutchison CJ, McLeish T, Chakrabarti B. Delayed
self-regulation and time-dependent chemical drive leads to novel states
in epigenetic landscapes. J R Soc Interface. 2014;11(100):20140706.
70. Lawton-Rauh AL, Alvarez-Buylla ER, Purugganan MD. Molecular
evolution of flower development. Trends Ecol Evol. 2000;15(4):144–9.
71. Mandel MA, Yanofsky MF. A gene triggering flower formation in
arabidopsis. Nature. 1995;377(6549):522–4.
72. Benlloch R, Berbel A, Serrano-Mislata A, Madueño F. Floral initiation and
inflorescence architecture: a comparative view. Ann Bot. 2007;100(3):
659–76.
Submit your next manuscript to BioMed Central
and take full advantage of: 
• Convenient online submission
• Thorough peer review
• No space constraints or color figure charges
• Immediate publication on acceptance
• Inclusion in PubMed, CAS, Scopus and Google Scholar
• Research which is freely available for redistribution
Submit your manuscript at 
www.biomedcentral.com/submit
1
Dynamic network and epigenetic landscape model of a1
regulatory core underlying spontaneous immortalization and2
epithelial carcinogenesis3
Méndez-López LF1,2,†, Davila-Velderrain J1,2,†, Enŕıquez-Olgúın C3, Martinez-Garcia JC3,∗,4
Alvarez-Buylla ER1,2,∗
5
1 Instituto de Ecoloǵıa, Universidad Nacional Autónoma de México, Cd. Universitaria,6
México, D.F. 04510, México7
2 Centro de Ciencias de la Complejidad (C3), Universidad Nacional Autónoma de México,8
Cd. Universitaria, México, D.F. 04510, México9
3 Departamento de Control Automático, Cinvestav-IPN, A. P. 14-740, 07300 México, DF,10
México11
∗ Corresponding authors: juancarlos martinez-garcia@conciliencia.org,12
eabuylla@gmail.com13
† These authors contributed equally to this work14
Abstract15
Tumorigenic transformation of human epithelial cells in vitro has been described experimentally as the16
potential result of a process known as spontaneous immortalization. In this process a generic series of17
cell–state transitions occur in which normal epithelial cells acquire a senescent state, later surpassed to18
attain a mesenchymal state and finally a mesenchymal stem–like phenotype, with a potential tumorigenic19
behavior. In this paper we integrate published data on the molecular components and interactions20
that have been described as key regulators of such cell states and transitions. Such large network,21
that is provided, is then reduced with the aim of recovering a minimal regulatory core incorporating22
the necessary and sufficient restrictions to recover the observed cell states and their generic progression23
patterns in epithelial–mesenchymal transition. Data is formalized into logical regulatory rules that govern24
the dynamics of each of the networks components as a function of the states of its regulators. The25
proposed core gene regulatory network attains only three steady–state gene expression configurations26
that correspond to the profiles characteristic of normal epithelial, senescent, and mesenchymal stem–like27
cells. Interestingly, epigenetic analyses of the uncovered network shows that it also recovers the generic28
time–ordered transitions documented during tumorigenic transformation in vitro of epithelial cells, and29
which strongly correlate with the patterns observed during the progressive pathological description of30
epithelial carcinogenesis in vivo.31
Introduction32
Nearly 84% of cancers diagnosed in human adults are carcinomas (i.e., cancer of epithelial origin), and33
their emergence is strongly associated with both an underlying chronic inflammatory process and with34
aging [1]. The precise role and the contribution of these two processes to the origin, progression, and35
detected clinic behavior of epithelial cancers remains elusive, however. The current general assump-36
tion is that aging and inflammation increase the chance of accumulating somatic mutations, and this37
genetic instability ultimately leads to carcinoma. However, this view does not offer a logical or mecha-38
nistic explanation for well–documented observations. For example: (1) cancer cells show morphological39
and transcriptional convergences despite their diverse origin, (2) carcinogenesis recapitulates embryonic40
processes, (3) cancer behavior can be acquired in the absence of mutations through trans– or dediffer-41
entiation, and (4) cancer cells can be “normalized” by several experimental means [2–5]. Moreover, it42
is well–known that different carcinomas share the same cellular processes and histological stages or pro-43
gression patterns, as well as robust associations with lifestyle factors [6]. These empirical observations44
2
suggests that, in analogy to normal development, the human genome is associated with an underlying45
robust mechanism restricting cell states and temporal progression patterns that are characteristic of ep-46
ithelial carcinogenesis. In accordance with this view, other researchers have previously proposed that47
cancer can be considered a developmental disease [7, 8].48
In systems biology it is common to understand both cell differentiation and development in terms49
of dynamical systems theory. In this framework, the genome of a cell is directly mapped into a global50
and multi–stable gene regulatory network (GRN) whose dynamics yields several (quasi)stationary and51
stable distinct phenotypic cellular states [9–14]. That is, the same genome robustly generates multiple52
discrete cellular phenotypes through developmental dynamics [12,15,16]. These stable phenotypic states53
are called attractors and correspond to configurations of gene or protein activation states that underlie the54
cellular fates or phenotypes – i.e., which thus constitute biological observables. Therefore, developmental55
processes – cellular differentiation events in particular – are formalized in temporal terms as attractor’s56
(i.e., cell states) transitions. Here we adopt such approach to study the cell states attained and the57
time–ordered transitions observed during the tumorigenic transformation of epithelial cells cultured in58
vitro that surpass a senescent state; a process known as spontaneous immortalization.59
Experimental findings in molecular and cell biology of cancer research have revealed that it is pos-60
sible to recover cells with cancer–like phenotypes through some specific cellular transitions. This has61
been shown particularly in carcinomas [3, 17–19]. By a cellular transition we refer to a differentiation62
event in which a certain cell acquires a discretely different cellular phenotype. For example, the process63
called epithelial–mesenchymal transition (EMT) comprises a stereotypical cell state transition in which64
epithelial cells exposed, for example, to cytokines, are induced to undergo a discrete phenotypic change65
acquiring a mesenchymal phenotype [17,19]. Interestingly, through inflammation–induced EMT epithelial66
cells surpass senescence, and undergo spontaneous immortalization. Cells that emerge from this process67
manifest mesenchymal stem–like properties and are capable of developing cancer in murine models [3,18].68
Furthermore, these cells are difficult to distinguish phenotypically and in terms of the transcription fac-69
tors that they express from either the so–called cancer stem cells (also known as tumor initiating cells)70
or from embryonic stem cells [20, 21].71
In the present work, we hypothesize that a generic series of cell state transitions widely observed and72
robustly induced by inflammation in cell cultures during spontaneous immortalization naturally result73
from the self–organized behavior emerging from an underlying intracellular GRN. During this process,74
normal epithelial cells first acquire a senescent state, to finally attain a mesenchymal stem–like cellular75
state with a potential tumorigenic behavior. We speculate that tissue–level conditions associated with76
a bad prognosis, such as a pro–inflammatory milieu, may increase the rate of occurrence of these same77
transitions in vivo promoting as a result the emergence and progression of epithelial cancer.78
In an attempt to provide mechanistic insights into the regulation of the aforementioned observed cell–79
fates specification, as well as the time–ordered cell–state transitions, we propose here a cellular level GRN80
model that integrates the available experimental data concerning the main molecular components and81
interactions related to the emergence and progression of carcinomas. We propose a large GRN of 41 nodes82
that integrates cellular processes thoroughly studied experimentally, but which have not been integrated83
before into a single GRN. Specifically, the large GRN model includes key molecular regulators that: (1)84
characterize the cellular phenotypes of epithelial, mesenchymal, and senescent cells; (2) are involved in85
the induction of the cellular processes of replicative senescence, cellular inflammation, and EMT; and (3)86
characterize the phenotypic changes undergone by cells emerging from these processes (i.e. mesenchymal87
stem–like cells). To obtain a minimal regulatory core for further dynamical analyses we formally reduced88
the large GRN. We show that the proposed regulatory core module displays an orchestrating robust be-89
havior akin to that seen in other developmental regulatory modules previously characterized with similar90
formal approaches (see, for example [9, 10, 22, 23]). Specifically, by proposing logical functions grounded91
on experimental data for this regulatory core module and by analyzing its behavior following conventional92
Boolean GRN dynamical approaches, we show that the uncovered minimal GRN converges only to three93
3
attractors. The uncovered states correspond to the expected gene expression configurations that have94
been observed for normal epithelial, senescent and stem–like mesenchymal cellular fates. Additionally, we95
also explore the GRN Epigenetic Landscape using a stochastic version of the model (following: [24, 25])96
in order to address if the proposed GRN also restricts or underlies the generic temporal sequence with97
which cell states occur in cell cultures and which correlate with observed patterns of cell–type enrichment98
during pathological descriptions of carcinoma progression.99
Results100
Gene Regulatory Network Construction101
Following a bottom–up and an expert knowledge approach we propose a set of cellular dynamical pro-102
cesses ubiquitous to epithelial carcinogenesis, namely: replicative cellular senescence, inflammation, and103
epithelial–mesenchymal transition (EMT). The cellular phenotypes epithelial, senescent, and mesenchy-104
mal cell–types – as well as a mesenchymal embryonic–like state; have been largely characterized as105
biological observables involved in such processes. We provide further definitions of these – and associated106
– phenotypes and processes in our complementary Text S1. We take this information as a methodolog-107
ical basis to integrate a generic dynamical network model of epithelial carcinogenesis. As a first step in108
network integration, based on an extensive literature search (see Methods and Text S1), we assembled a109
set of transcription factors (TFs) and additional molecules involved in the establishment and regulation110
of these cellular states and processes. Subsequently, we manually retrieved documented regulatory inter-111
actions among the molecules, considering only those supported by experimental evidence. For a detailed112
description of the published information for each interaction proposed see Text S1. The constructed large113
GRN is shown in Figure 1 (see Methods). TFs are represented in graphical terms by squares and the114
rest of the molecules by circles. The identified large network consists of 41 nodes and 97 interactions; it115
includes 12 TFs which can be considered as key regulators of the processes under consideration. Colors116
indicate the association that each node hold with specific cellular phenotypes or processes being consid-117
ered: epithelial (green), mesenchymal (orange), inflammation (red), senescence and DNA damage (blue),118
cell–cycle (purple), and polycomb complex (yellow).119
The Proposed Network is Enriched with Cancer Pathways120
In order to provide additional partial support for the association of the bio-molecular set of regulatory121
interactions that we have manually curated based on published data with the processes under consider-122
ation, as well as with carcinoma, we performed a network–based gene set enrichment analysis (GSEA)123
(see Methods). Among the 13 pathways or processes reported as significant when taking the KEGG124
database as a reference, 9 (69%) correspond to cancer pathways, namely: Bladder cancer, Chronic125
myeloid leukemia, Pancreatic cancer, Glioma, Non–small cell lung cancer, Melanoma, Small cell lung126
cancer, Prostate cancer, and Thyroid cancer – note that 6 (66.6%) of these correspond to carcinomas.127
On the other hand, when taking the GO Biological Process database as reference, among the significant128
results we found: replicative senescence, cellular senescence, cell aging, activation of NF–κB–inducing129
kinase activity, determination of adult life span, epithelial cell differentiation, and positive regulation of130
NF–κB transcription factor activity (see Table 1). Using network topological gene set analysis (see Meth-131
ods) we found that, in addition to pathway enrichment, the topological signature of the molecules in the132
proposed network also shows a topological signature that is similar to the one shown by reference cancer133
pathways included in the KEGG database (see Figure S1). These results provide partial support for134
the proposed molecular players: given the current state of knowledge according to annotated databases,135
the set of molecules manually included in the proposed large network seems to be representative of the136
cellular phenotypes and processes considered as prior biological knowledge in our model. In addition,137
4
the molecular components included in the proposed large network are tightly associated with reference138
pathways of epithelial cancers.139
A Core Regulatory Network Module Underlying Spontaneous Immortalization140
We performed a knowledge–based network reduction of the large GRN in Figure 1 in order to derive a141
smaller, core GRN module for which both a topology and architecture with fully defined logical func-142
tions could be established, and which could also be analyzed as a dynamical system (see Methods). In143
addition, such regulatory core should comprise the necessary and sufficient set of nodes and interactions144
that integrate the processes involved in the large network and that could explain, at least in part, the145
restricted set of the cell–states and time–ordered transitions among them during spontaneous immortal-146
ization and epithelial cancer emergence/progression. We were able to define a set of molecular species147
whose regulatory hierarchy, activity, and expression define the identity of the phenotypes of epithelial,148
mesenchymal, and senescent cells. We also converged to, and included, main regulators of replicative149
cellular senescence, inflammation–induced EMT, and determinants of an induced mesenchymal stem–like150
phenotype. Hence, after reduction we obtained a core GRN consisting of only 9 nodes: ESE–2, Snai2,151
NF–κB, E2F, p53, p16, Rb, Cyclin, and Telomerase. Figure 1b shows the proposed core regulatory152
module (colored nodes) in the context of the larger proposed network. For details on how these 9 nodes153
were selected over the rest of the nodes see Text S1. In what follows we present a brief description of154
the nodes included in the reduced GRN, as well as some of the key molecular mechanism encoded in155
the regulatory logic. Although many of the nodes that are included in this regulatory core module have156
been thoroughly studied experimentally and in terms of their involvement in different types of cancer,157
the architecture and topology of the proposed regulatory core module is novel.158
ESE–2 represents the activity of the TFs ESE–1, ESE–2, and ESE–3 (also known as ESX, E74–like159
factor 5, and EHF; respectively) – for a table with synonyms Table E1 in supplementary file. These160
proteins belong to the subgroup ESE (i.e. epithelium–specific) of the TF family ETS. ESE–2161
promotes its own expression and the expression of the other ESE TFs [26–28]. On the other hand,162
ESE–2 represses Snai2 – one of the main EMT promoting TFs – expression by direct interaction163
with its promoter region [29].164
p16 represents the activity of the INK4b–ARF–INK4a locus, which encodes for the proteins p16 and165
p14. Cellular senescence is molecularly characterized by the expression of the proteins p16 and166
p53 [30]. p16 indirectly inhibits E2F by inhibiting cyclins CDK 2,4 and 6, which in turn inhibit167
Rb [31, 32]. On the other hand, the INK4b–ARF–INK4a – and thus p16 – is regulated by the168
activity of Polycomb–group proteins by means of promoter hypermethylation [33].169
p53 represents the protein with the same name. The shortage of telomeric DNA seems to be recognized170
as DNA damage promoting the activation of p53. In senescence, the activity of p16 and p53 over171
Rb, E2F and Cyclins invariably arrests the cell–cycle in the phase G1/G2 [34,35].172
Rb represents the cell–cycle regulator with the same name. Rb prevents cycle progression by forming a173
complex with the TF E2F [36].174
E2F represents the TF with the same name. E2F regulates critical genes for adequate cell–cycle pro-175
gression.176
Cyclin represents the activity of the complex Cyclin–dependent kinases (CDKs) known to inactivate Rb177
by phosphorylation. The latter, in turn, promotes the activity of E2F and cell–cycle progression [37].178
5
NF-κB represents cellular inflammation by the activity of the TF NF–κB. Accordingly, with this node179
we also represent the effect of the cytokines transforming growth factor–beta (TGF–β), interleukin–180
6 (IL–6), and tumor necrosis factor alpha (TNF–α). These three factors converge in the activation181
of NF–κB by phosphorylating the inhibitor IκB [38,39].182
TELasa represents the enzyme telomerase. This enzyme is responsible for the de novo synthesis of telom-183
eres. Most human cell–types do not express telomerase; however, it is expressed on immortalized184
epithelial cells, and it is thought to be responsible for telomere extension in tumors [40].185
Snai2 this node includes the activity of the main TFs known to be directly associated with EMT regu-186
lation, namely: Snai2 (Slug), Snail, Twist1, Twist2, ZEB1, ZEB2, and FOXC2. These TFs repress187
(induce) the expression of genes specific to epithelial (mesenchymal) cells [41, 42]. It has been188
proposed that there is a regulatory hierarchy driving EMT in which Snail activates Snai2, Twist,189
Zeb, and FOXC2. The latter, in turn, regulates Snail and Snai2 in a positive manner [41, 43–45].190
Regardless of a hierarchical interpretation, it is well–documented that these TFs maintain the191
mesenchymal phenotype in a coordinated fashion, showing co–expression patterns and regulatory192
crosstalk [44, 45]. It has been suggested that among these TFs, Snai2 may be the strongest sup-193
pressor of the epithelial phenotype [46]. However, we decided to represent the collective regulatory194
activity of the mesenchymal TFs using Snai2 based on the recent experimental demonstration of195
an antagonistic relation between Snai2 and ESE–2. Specifically, in vitro and in vivo studies showed196
that ESE–2 regulates the transcription of Snai2 [29].197
According to our model reduction methodology, literature search, careful manual curation, and198
network–based enrichment analysis; we propose that the derived core GRN module (see Figure 2) in-199
cludes a molecular set which is both necessary and sufficient to specify the identity of the aforementioned200
cellular phenotypes and to represent the main intracellular regulatory events driving spontaneous im-201
mortalization in a robust manner. We test our proposal by building and analyzing a mechanistic GRN202
dynamical model (see below).203
Recovered Attractors of the Core GRN Module Correspond to Configurations204
that Characterize Expected Cellular Phenotypes205
Based on the experimental data concerning the expression patterns of the genes incorporated in the pro-206
posed core GRN model in Figure 2 we assembled a table with a Boolean format of the state configurations207
expected to be recovered with the proposed GRN dynamical model. We refer to this configurations as the208
“expected attractors” – these correspond to the empirically observed genetic configurations. Furthermore,209
we integrated and formalized the experimental data concerning the interactions among the GRN nodes210
using Boolean logical functions that will rule the Boolean GRN dynamics and comprise the architecture211
of the proposed GRN. The set of formulated rules underlying the regulatory events is shown in Text S1212
– each logical rule is presented both as a logical preposition and as a truth table. Using the set of nodes213
and their corresponding logical rules we completely define a mechanistic dynamical GRN model [47]. The214
exhaustive computer–based simulation analysis of this model (see Methods) recovered three fixed–point215
attractors. Interestingly, the recovered attractors showed perfect correspondence with the expected at-216
tractors representing cellular phenotypes (see Table 2). The three recovered attractors correspond to the217
expected epithelial, senescent, and mesenchymal stem–like phenotypes :218
The normal epithelial cell phenotype is represented by the attractor with ESE–2, E2F, Cyclin and219
NF–κB activity. ESE–2 is an epithelial–specific TF which regulates a large number of genes specific to220
epithelial cells [48, 49]. NF–κB shows ubiquitous expression through the different types of human cells;221
however, it is also positively regulated by TFs of the ESE family (i.e. ESE–2) [50]. Moreover, under222
6
inflammatory conditions the activity of NF–κB is enhanced [51,52]. On the other hand, E2F and Cyclin223
represent core regulators of cell–cycle entrance, and thus specify proliferative capability [53, 54].224
The senescent cell phenotype is represented by the attractor with ESE–2, Rb, p16, p53, and NF–κB225
activity. Its biological counterpart would be an epithelial cell induced to replicative senescence, given226
(1) that it is expected to repress E2F [48]; and (2) that Rb, p16, p53, and NF–κB are the molecular227
biomarkers of cellular senescence [55].228
Messenchymal Stem-like phenotype In the model proposed here, the attractor whose configuration229
shows Snai2, Cyclin, NF–κB, and Telomerase activity – and inactivity of ESE–2, p16, Rb, p53, and E2F230
– would correspond to a mesenchymal stem–like phenotype with tumorigenic potential (see discussion231
below).232
233
The correspondence between the recovered attractors and the expected cellular phenotypes strongly234
suggests that the proposed nine–node core GRN indeed constitutes a regulatory module that is robust235
to initial conditions and that comprises a set of necessary and sufficient components and interactions to236
restrict the system to converge to the cellular phenotypes observed during spontaneous immortalization.237
Validation of the Uncovered Core Regulatory Module: Loss and gain–of–238
function Mutant and Robustness Analyses239
In order to validate the Boolean GRN dynamics we tested if the same GRN module is able to recover240
observed attractors in loss and gain of function mutants. We simulated such mutants analogous to ex-241
perimental observations reported in the literature. Specifically, we simulated loss– and gain–of–function242
mutations of ESE–2, Snai2, and p16 that have been reported in the literature. When simulating ESE–2243
gain of function (by setting the expression state for this node permanently to “1” in the simulations), the244
GRN model recovers three attractors corresponding to three different phenotypes which have been exper-245
imentally described and are associated with ESE–2 over–expression: an epithelial senescent cell [56], a246
normal epithelial cell [29], and a metastable state with proliferative phenotype [57]. In the case of ESE–2247
loss–of–function (simulated by setting the expression state of this node to “0” permanently), the model248
recovers an attractor corresponding to a mesenchymal phenotype, which is also consistent with observa-249
tions [29]. For Snai2, gain–of–function simulation recovers one attractor corresponding to mesenchymal250
stem–like phenotype, which is consistent with observations from ectopic over–expression experiments251
of mesenchymal TFs [18, 58, 59]. Snai2 loss–of–function simulation, on the other hand, recovered two252
attractors corresponding to normal and senescent epithelial phenotypes, which is also consistent with253
observations [29,60]. Finally, gain–of–function simulation of p16 recovered two attractors; one associated254
with a mesenchymal stem–like but incompletely senescent (due to the lack of p53) phenotype; the other255
corresponding to an epithelial senescent phenotype. The first prediction is consistent with the status of256
immortal and apoptosis–resistant shown by mesenchymal stem–like cells, as well as with the capability257
of mesenchymal TFs to abrogate senescence [61]. The second attractor is consistent with the potential258
for replicative senescence of epithelial cells. p16 loss–of–function simulation recovers two attractors cor-259
responding to an epithelial cell and a mesenchymal stem–like cell. This prediction is consistent with260
the observed biological conditions for both phenotypes, where p16 is commonly repressed by polycomb261
proteins [62]. The recovered attractors in mutant conditions are shown in Figure S2 in supplementary262
file.263
It is important to note that, given that the uncovered regulatory module uncovered here is the result264
of a model reduction methodology where we permissively chose to represent multiple molecular species265
by the activity of some of the nodes, a direct interpretation of mutant simulations is not straightforward.266
Consequently, care should be taken when interpreting the results of the simulations or making predictions267
7
of mutant phenotypes yet to be experimentally tested and further explored in the context of the larger268
GRN in Figure 1, which is the focus of an ongoing study. With this in mind, instead of simulating269
additional mutant conditions, we further validated the dynamical GRN model by testing its robustness270
to perturbations of the logical rules. Specifically, we tested the robustness of the predicted attractors by271
generating a large set of perturbed networks (e.g, 10,000), calculating their respective attractors, and then272
counting the occurrences of the original attractors within the perturbed set. We generated each perturbed273
network by choosing a function of the network at random and flipping a single bit in this function [63].274
We performed four complementary in silico based experiments following this general robustness analysis.275
First, we estimated the fraction of occurrences of the three original attractors (i.e., their robustness).276
Then, we repeated the experiment three times, but each time estimating the robustness of each individual277
attractor. For these four experiments we estimated a robustness (i.e., fraction of times) of 0.7439, 0.905,278
0.923, and 0.902, respectively. Hence, out of 10,000 random networks generated by in silico perturbations279
to the logical rules, a major fraction recovered the original attractors; as it is expected for a developmental280
(core) regulatory module that is robust both to transient (initial) and genetic perturbations [10]. This281
result supports the view that the core GRN uncovered here is indeed a regulatory network module driving282
developmental dynamics. It also constitutes a mechanistic explanation (for definitions, see [47]) to the283
generic cell phenotypes observed during spontaneous immortalization in vitro and which correlate with284
the cellular description of carcinoma progression in vivo (see below).285
Attractor Time–Ordered Transitions: Epigenetic Landscape of the Uncovered286
GRN Core Module287
During the tumorigenic transformation of epithelial cells in culture, a generic time–ordered series of cell288
state transitions is observed and robustly induced by inflammation [3,18]. Normal epithelial cell become289
senescent cells, which afterwards overcome this latter state acquiring a final mesenchymal stem–like290
phenotype. Interestingly, during the progressive pathological description of epithelial carcinomas in vivo291
the temporal pattern with which each of these different cell phenotypes enriches the tissue seems to be292
tightly ordered and is also generic to all types of such cancers irrespective of the tissue where they first293
appear. In order to test if the uncovered GRN core module not only underlies and restricts the types of cell294
phenotypes (attractors) but also their time–ordered transitions, following [25] we explored its associated295
Epigenetic Landscape (EL) by implementing a discrete stochastic model as an extension to the Boolean296
network model [12] (see Methods). By means of computer–based simulations we performed two analyses297
in order to uncover functional and structural constraints in attractor transitions. (1) We explored the298
temporal sequence of attractor attainment, and (2) we calculated the consistent global ordering of all the299
given attractors. Specifically, following [24], we found that the most probable temporal order of attractor300
attainment for a cell (population) initially on epithelial state is:301
Epithelial→ Senescent→ Mesenchymal stem–like,
see Fig 4a. On the other hand, following [64] we defined a consistent global ordering of the uncovered302
attractors based on their relative stability (see Methods). Relative stability calculations are based on the303
mean first passage time (MFPT) between pairs of attractors. These, in turn, epitomize barrier heights in304
the EL by approximating a measure for the ease of specific transitions. Similar to the previous analysis,305
the uncovered global ordering of attractors is Epithelial→ Senescent→ Mesenchymal stem–like (Fig 4b).306
This corresponds to the only order in which the system can visit the three attractors following a positive307
net transition rate. These results indicate that, when considering only intracellular regulatory constraints308
alone, the uncovered GRN core module structures the epigenetic landscape in a way that a specific flow309
across the landscape is preferentially and robustly followed. We anticipate that observed transition rates310
in vivo are likely to depend on tissue–level processes and/or additional GRN components underlying311
epithelial cell sub–differentiation, that have not been considered here. These latter restrictions will be312
modeled in future contributions building up on the framework that has been put forward here.313
8
Discussion314
Multicellularity by definition implies a one–to–many genotype–phenotype map. The genome of a mul-315
ticellular individual possesses the intrinsic potentiality to implement a developmental process by which316
all its different cell–types and tissue structures are ultimately established. In the last decades, a quite317
coherent theory to explain the development of multicellular organisms as the result of the orchestrating318
role of GRNs has been developed [9, 11, 12]. The main conclusion is that observable cell states emerge319
from the self–consistent multistable regulatory logic dictated by genome structure and obeyed by (mainly)320
transcription factors (TFs) resulting in stable, steady–states of gene expression. Cancer development and321
progression is also a phenomenon intrinsic to multicellular organisms. Furthermore, similar to normal322
development, cancer is robustly established as evidenced by its directionality and phenotypic conver-323
gence [2]. Is cancer somehow orchestrated by GRN dynamics as well? Several hypothesis have been324
presented in this direction such as the cancer attractor theory [2,8], and the endogenous molecular cellu-325
lar network hypothesis [65,66]. In this contribution we also follow the viewpoint of an intrinsic regulatory326
network, but we focus on a specific developmental process at the cellular level: the robust cell state327
transitions observed during the tumorigenic transformation of human epithelial cells in culture induced328
by inflammation and resulting from surpassing a senescent state through EMT – i.e., tumorigenic trans-329
formation due to spontaneous immortalization. We propose that a mechanistic understanding of this330
process is an important first necessary step to unravel key cellular processes which might be occurring331
in vivo, where its rate of occurrence is likely to be regulated by tissue–level and systemic conditions332
directly linked with lifestyle choices, as well as additional regulatory interactions underlying epithelial333
cell sub–differentiation.334
A Generic Molecular Regulatory Network335
The predominant strategy in the molecular study of cancer and cellular tumorigenic transformation336
has been to focus on pathways and associated mutations. Aware that signaling pathways are actually337
embedded in complex regulatory networks here we assembled from curated literature a GRN comprising338
the main molecular regulators involved in key cellular processes ubiquitous to carcinogenesis following339
a bottom–up approach (see results). Subsequently, we followed a mechanistic approach to address the340
question of whether we assembled a set of necessary and sufficient molecular players and interactions341
to recover the cellular phenotypes and processes documented during the spontaneous immortalization of342
human epithelial cells in culture: we proposed, analyzed and validated an experimentally grounded core343
GRN dynamical model.344
Small developmental regulatory modules have been shown to successfully include the necessary and345
sufficient set of components and interactions for explaining, as manifestations of intrinsic structural346
and functional constraints imposed by these GRNs, the dynamics of complex processes such as stem347
cell differentiation [67], cell–fate decision [68] and similar cellular processes during plant morphogenesis348
[9,10,22,24]. We hypothesized that a similar core developmental module can be formulated in an attempt349
to explain the cell–fates observed during spontaneous immortalization of human epithelial cells in vitro350
resulting in a potentially tumorigenic state. In order to show this, we first reduced the proposed larger351
network into a regulatory core module, by eliminating transitory pathways within the network and by352
including compounded nodes while maintaining the core network structure and without affecting the353
dynamical output during each reduction step (for details, see Methods). We obtained a small set of main354
molecular players (Fig 2). We extracted from available literature the expression profiles of the generally355
observable cell states of interest in terms of this minimal set of molecules (see Table 2). Given our main356
hypothesis, we tested if the reduced molecular set and their regulatory logic formalized as a Boolean GRN357
model were able to recover the biologically observable expression profiles as stationary and stable network358
configurations (i.e., attractors). Interestingly, we found that the core GRN model only converges to the359
observed gene expression profiles in wild–type (see Table 2) and some mutant backgrounds (see results).360
9
This result strongly suggest that we have successfully included the key regulators and interactions at361
play during the establishment of cell states observed during the tumorigenic transformation of human362
epithelial cells resulting from spontaneous immortalization.363
It is noteworthy that our model does not include any hypothetical interaction or component, a com-364
mon practice in GRN modeling [10, 22, 68]. Our GRN model exclusively integrates available published365
experimental data; indeed, it was a surprising result that the observed dynamical behavior emerged natu-366
rally under such conditions. This suggests that despite incomplete information, there is enough molecular367
data to uncover important restrictions underlying cell behavior during transitions relevant to epithelial368
carcinogenesis. Consequently, we consider that the networks reported herein (both the large and the core369
GRNs) may serve as bona fide base models useful to integrate novel discoveries, as well as components370
underlying epithelial cellular sub–differentiation, while following a bottom–up approach in cancer network371
systems biology.372
Attractor Time–Ordered Transitions373
Discrete GRN models can be used to integrate regulatory mechanisms that not only recapitulate the374
observed gene expression patterns, but that also reproduce the observed developmental time–ordering of375
cell phenotypes. This can be done by considering stochasticity in the model in order to explore [12,23,25]376
and/or characterize [64] the associated EL. Importantly, by exploring noise–induced transitions we do not377
assume that noise alone is the driving force of the transitions, instead, we exploit noise as a tool to explore378
the GRN–based version of Waddington’s EL and to indirectly characterize its structure. Specifically, by379
calculating the relative stability of the attractors (see Methods) we approximate the in–between attractor380
barrier heights in the landscape. Furthermore, measures of relative stability can also be exploited to381
calculate net transition rates measuring the ease of specific inter–attractor transitions and to uncover382
the predominant developmental route across the epigenetic landscape [69]: ordered transitions sharing383
positive net transition rates will be preferentially followed. Our results show that such a developmental384
route follows the time–order of cellular phenotypic states epithelial→senescent→mesenchymal stem–like385
(potentially tumorigenic). In other words, the constraints imposed by the GRN structure the associated386
EL in such a way that an epithelial cell in culture as a “ball” would naturally roll following such a path,387
in agreement with the observed spontaneous immortalization process.388
Even in the case of the simple model presented here, it is interesting that of the many possible cell389
states and developmental routes, the core GRN network is canalized to the few steady–states and the390
developmental time–ordering consistent with the molecular characterization of cell phenotypes observed391
during spontaneous immortalization and correlating with carcinoma progression in vivo (see below).392
This suggests that specific progressive alterations or particular “abnormal” signaling mechanisms are not393
necessarily required for a cell to reach a potentially tumorigenic state. Additionally, robustness analysis394
performed on the same network showed that the recovered attractors are also robust to permanent395
alterations of the regulatory logic.396
From Abstract Network Attractors and Dynamics to Biological Insight397
We are aware of the high degree of simplification involved in the model proposed herein. Accordingly, we398
do not attempt to present it as a source of accurate predictions for either the occurrence or the future399
behavior of a phenomena as complex as carcinogenesis. Instead, we formulate the model in an attempt400
to provide some intuition into otherwise highly complicated processes, and to illuminate increasing body401
of confounding descriptions. Simple mechanistic models like the one presented here sacrifice detail and402
accuracy in exchange for understanding [47,70]. What biological insights can be gained by the uncovered403
GRN dynamical model? Our simple GRN model strongly suggests that the generic series of cell state404
transitions widely observed and robustly induced by inflammation in cell culture from normal epithelial405
to immortalized senescent cells, and from this latter state to a final mesenchymal stem–like phenotype406
10
in the process defined as spontaneous immortalization naturally result from the self–organized behavior407
emerging from an underlying GRN novel architecture and topology.408
Importantly, cells that emerge from spontaneous immortalization induced by cytokines display mes-409
enchymal stem like phenotype and tumorigenic behavior – i.e., repress proteins p16 and p53, surpass410
senescence, and re–express telomerase [18]. Phenotypically, these cells are difficult to distinguish from411
the so–called cancer stem cells, tumor initiating cells or embryonic stem cells [20, 21]; are resistant to412
apoptosis; and have the ability to migrate and generate metastasis and form secondary tumors – all413
lethal traits characterizing cancer cells [3]. We, thus, speculate that tissue–level conditions associated414
with a bad prognosis, such as a pro–inflammatory milieu, may increase the rate of occurrence of these415
same transitions in vivo promoting as a result the development and progression of epithelial cancer. We416
substantiate this view by noting several independent empirical observations. (1) Histological diagnosis of417
carcinoma are generally preceded by a lesion called hyperplasia; senescent cells are abundant in hyper-418
plasias and scarce in carcinomas [71]. (2) During chronological aging senescent cells increase in number419
within both normal tissues and hyperplasias. (3) Senescence is associated with the promotion of carcino-420
genesis by contributing with the loss of tissue architecture and promoting an inflammatory milieu [72].421
(3) Overcoming the senescent barrier is fundamental in tumor progression [73,74]. (4) The EMT process422
constitutes a well–characterized mean to overcome senescence under an inflammatory environment( [75]).423
We must point out, however, that transition rates during spontaneous immortalization, if occurring424
in vivo, may be regulated by tissue–level, self–organizational processes not considered in our cellular425
level model. For example, the likelihood of spontaneous immortalization in vivo may be increased by426
extracellular perturbations that inevitably occur during aging; mainly, by inflammation and tissue re-427
modeling resulting from an increased population of senescent cells. The cellular level network models428
reported here are, nevertheless, a valuable building block for more detailed multi–level models integrating429
further sources of tissue–level constraints such as cell cycle progression, cell–cell interactions, differential430
proliferation rates, and mechanical forces.431
Summarizing, in this contribution we propose an experimentally grounded GRN model for sponta-432
neous immortalization. We report one large GRN model (41 nodes) and one core GRN developmental433
module (9 nodes), both useful and necessary for further integration of signaling and mechanical pro-434
cesses in multi–level, more detailed modeling efforts. We explore by analyzing the dynamical behavior of435
the latter if the uncovered GRN topology and architecture underlies the gene expression configurations436
that characterize normal epithelial, senescent, and mesenchymal stem–like cell–fates well documented437
during tumorigenic transformation in vitro and which correlate with those observed in the progressive438
pathological description of epithelial carcinogenesis in vivo. Overall, our results suggest that tumorigenic439
transformation in vitro due to spontaneous immortalization can be understood and modeled at a cellular440
level generically as a developmental system undergoing cell–state transitions resulting from the structural441
and functional constraints imposed, in part, by the interactions included in the proposed GRN. They442
also suggest that similar transitions may be occurring in vivo and might be relevant for carcinoma devel-443
opment and progression. This view is consistent with the robustness, generic patterns, and directionality444
observed during the development of human cancers derived from epithelial tissues. Particularly, based445
on our results, we hypothesize that replicative senescence and chronic inflammation are likely to increase446
the occurrence of spontaneous immortalization in vivo promoting the development of epithelial carcino-447
genesis. Testing such hypothesis awaits the development of multi–level models taking the ones presented448
here as building blocks, and is the subject of ongoing investigation.449
11
Materials and Methods450
Literature Search451
A total of 159 references, considering both references in extended view material (see Text S1) and main452
text, were carefully and manually reviewed in order to first define a minimal set of cellular phenotypes and453
processes (for definitions, see Text S1) which enable a generic representation of epithelial carcinogenesis454
on the basis of cell state transition events. Subsequently, a set of associated, experimentally described455
molecular regulators was extracted from the literature, including their regulatory interactions.456
Network Assembly457
The network (see Fig. 1) was assembled manually by adding nodes (genes/proteins) and edges (activating458
or inhibitory interactions) describing direct mechanisms reported in the available literature to have an459
influence on both the specification of the cellular phenotypes and the development of the cellular process460
defined in (Text S1). The initial network was created based on experimentally grounded knowledge461
from 159 references (including reviews and research papers) and consists of 41 nodes and 97 edges. The462
literature included data known before 2014. Support for each of the proposed interactions is listed in463
Text S1.464
Network–based Gene Set Enrichment Analysis465
The bioinformatics tools EnrichNet [76] and TopoGSA [77] were used to perform network–based gene466
set enrichment analysis and topology–based gene set analysis, respectively. Briefly, EnrichNet maps467
the input gene set into a molecular interaction network and calculates distances between the genes and468
pathways/processes in a reference database. TopoGSA also maps the input gene set into a network, and469
then it computes its topological statistics and compares it against the topology of pathways/processes in470
a reference database. Here a connected human interactome graph extracted from the STRING database471
and the KEGG and GO Biological Process databases were used as reference molecular interaction network472
and databases. Both analyses were performed using the Cytoscape plugin Jepettp [78].473
Network Reduction474
In order to extract a representative core regulatory model from the initial network and to obtain a475
more computationally tractable one, which reasonably unfolds the regulatory pathways, a reduction476
methodology was followed based on certain simplifying assumptions – supported by previous results in477
molecular biology studies – and on mathematical results from dynamical systems and graph theory. Here478
we briefly describe the main steps. The step–by–step reduction process is included in Text S1.479
Simplifying assumptions:480
• ESE-2 groups activities of ESE-1, ESE-3, EGF, Her-2/neu.481
• Snai2 groups activities of Snail, Twist (Twist, in turn, groups activities of Twist1 and Twist2), Zeb482
and FOXC2.483
• p16 groups p14 and NF–κB node groups the inflammatory response activated by growth factors,484
mitogens and cytokines.485
12
Reduction process: (1) Simple mediator nodes (i.e., those nodes with in–degree and out–degree of486
one) were removed iteratively. (2) Nodes with in–degree of one and out–degree greater than one were487
removed iteratively. These steps (1 and 2) does not alter the attractors of the Boolean network under488
the asynchronous update, as mathematically proved in [79]. (3) Redundant interactions of selected nodes489
(based on biological arguments) resulting in self–regulation were included in single nodes/interactions490
(for details, see Text S1). (4) Selected nodes (based on biological knowledge again) with in–degree491
greater than one and out–degree of one were removed. The final steps (3 and 4) are supported by the492
mathematical analysis made in [80] in which the authors prove that the methodology preserves relevant493
topological and dynamical properties.494
It is noteworthy that fixed point attractors are time–independent, so they are the same in both495
synchronous and asynchronous update methods. Complex attractors (in which the system oscillates496
among a set of states), on the other hand, depend on the update method. Consistently, the update497
method used in the model is irrelevant for the obtained results. This last assertion is valid because498
the model shows only fixed point attractors, which means, under the mathematically proved reduction499
methods applied, that the large network describes a qualitative long time behavior conserved in the reduced500
one. Besides, the methodology applied in order to obtain the reduced network enables the analysis of a501
resulting regulatory graph which is biologically meaningful and dynamically consistent with the network502
constructed with available molecular biology experimental data.503
The final reduced network is shown in Figure 2. We refer to this network and its corresponding logical504
rules as the core regulatory module.505
Dynamical Gene Regulatory Network Model506
A Boolean network models a dynamical system assuming both discrete time and discrete state variables.507
This is expressed formally with the mapping:508
xi(t+ 1) = Fi(x1(t), x2(t), ..., xk(t)), (1)
where the set of functions Fi are logical prepositions (or truth tables) expressing the relationship between509
the genes that share regulatory interactions with the gene i, and where the state variables xi(t) can510
take the discrete values 1 or 0 indicating whether the gene i is expressed or not at a certain time t,511
respectively. An experimentally grounded Boolean GRN model is then completely specified by the set512
of genes proposed to be involved in the process of interest and the associated set of logical functions513
derived from experimental data [23]. The set of logical functions for the core regulatory module used in514
this study is included in Text S1 – both as logical prepositions and truth tables. The dynamical analysis515
of the Boolean network model was conducted using the package BoolNet [63] within the R statistical516
programming environment (www.R-project.org).517
Epigenetic Landscape Exploration518
Including Stochasticity519
In order to extend the Boolean Network into a discrete stochastic model and then study the properties520
of its associated EL, the so–called stochasticity in nodes (SIN) model was implemented following [23–25].521
In this model, a constant probability of error ξ is introduced for the deterministic Boolean functions. In522
other words, at each time step, each gene “disobeys” its Boolean function with probability ξ. Formally:523
Pxi(t+1)[Fi(xregi(t))] = 1− ξ,
Pxi(t+1)[1− Fi(xregi(t))] = ξ.
(2)
The probability that the value of the now random variable xi(t+1) is determined or not by its associated524
logical function Fi(xregi(t)) is 1− ξ or ξ, respectively.525
13
Attractor Transition Probability Estimation526
An attractor transition probability matrix Π with components:527
πij = P (At+1 = j|At = i), (3)
representing the probability that an attractor j is reached from an attractor i, was estimated by numerical528
simulation following [24]. Specifically, for each network state i in the state space (2n) a stochastic one–529
step transition was simulated a large number of times (≈ 10, 000). The probability of transition from an530
attractor i to an attractor j was then estimated as the frequency of times the states belonging to the531
basin of the attractor i were stochastically mapped into a state within the basin of the attractor j.532
Following the discrete time Markov chains (DTMCs) [81] theoretical framework, the estimated tran-533
sition probability matrix was integrated into a dynamic equation for the probability distribution:534
PA(t+ 1) = ΠPA(t), (4)
where PA(t) is the probability distribution over the attractors at time t, and Π is the transition probability535
matrix. This equation was iterated to simulate the temporal evolution of the probability distribution over536
the attractors starting from a specific initial probability distribution.537
Attractor Relative Stability and Global Ordering Analyses538
In addition to the calculation of the most probable temporal cell–fate pattern (see [24]), a discrete539
stochastic GRN model enables the study of the ease for transitioning from one attractor to another [69].540
Specifically, a transition barrier in the EL epitomizes the ease for transitioning from one attractor to541
another. The ease of transitions, in turn, offers a notion of relative stability. It has recently been proposed542
that the GRN has a consistent global ordering of all cell attractors and intermediate transient states which543
can be uncovered by measuring the relative stabilities of all the attractors of a Boolean GRN [64, 69].544
Here, the relative stabilities of the cell states were defined based on the mean first passage time (MFPT).545
Specifically, a relative stability matrix M was calculated which reflects the transition barrier between546
any two states based on the MFPT. Here, in all cases, the MFPT was estimated numerically. Using the547
transition probabilities among attractors, a large number sample paths of a finite Markov chain were548
simulated. The MFPT from attractor i to attractor j corresponds to the averaged value of the number549
of steps taken to visit attractor j for the fist time, given that the entire probability mass was initially550
localized at the attractor i. The average is taken over the realizations. Following [69], based on the551
MFPT values a net transition rate between attractor i and j can be defined as follows:552
di,j =
1
MFPTi,j
−
1
MFPTj,i
(5)
This quantity effectively measures the ease of transition as a net probability flow. For all the calculation553
involving stochasticity, the robustness of the results was assessed by taking three different values for554
the probability of error (0.01, 0.05, 0.1). Stability of the results was assessed by manually changing the555
number of simulated samples until results become stable.556
The consistent global ordering of all attractors uncovered with the core GRN was defined based on the557
formula proposed in [64]. Briefly, the consistent global ordering of the attractors is given by the attractor558
permutation in which al transitory net transition rates from an initial attractor to a final attractor are559
positive. This is schematically represented in Figure 4b. Calculated transition probability, MFPT, and560
net transition rate matrices are included in Text S2. R source code implementing all the calculations and561
analyses is available upon request.562
14
Authors’ contributions563
ERAB and JMG coordinated the study and with the other authors established the overall logic and564
core questions to be addressed. All the authors conceived and planned the modeling approaches. FML565
recovered the information from the literature to establish the model and provided expert knowledge in566
cancer biology. JDV established many of the specific analyses to be done, and programmed and ran all567
the modeling and analyses. FML and CEO formalized experimental data into regulatory logic. ERAB,568
JMG, JDV and FML participated in the interpretation of the results and analyses. JDV wrote most of569
the paper with help from ERAB and JMG and input from FML. All authors proofread the final version570
of the ms submitted.571
Acknowledgments572
This work was supported by grants CONACYT 240180, 180380, 167705, 152649 and UNAM-DGAPA-573
PAPIIT: IN203113, IN 203214, IN203814, UC Mexus ECO-IE415. J.D.V acknowledges the support574
of CONACYT and the Centre for Genomic Regulation (CRG), Barcelona, Spain; while spending a575
research visit in the lab of Stephan Ossowski. This article constitutes a partial fulfillment of the graduate576
program Doctorado en Ciencias Biomédicas of the Universidad Nacional Autónoma de México, UNAM577
in which J.D.V. developed this project. J.D.V receives a PhD scholarship from CONACYT. The authors578
acknowledge logistical and administrative help of Diana Romo.579
References580
1. Anand P, Kunnumakara AB, Sundaram C, Harikumar KB, Tharakan ST, Lai OS, et al. Can-581
cer is a preventable disease that requires major lifestyle changes. Pharmaceutical research.582
2008;25(9):2097–2116.583
2. Huang S. On the intrinsic inevitability of cancer: from foetal to fatal attraction. In: Seminars in584
cancer biology. vol. 21. Elsevier; 2011. p. 183–199.585
3. Mani SA, Guo W, Liao MJ, Eaton EN, Ayyanan A, Zhou AY, et al. The epithelial-mesenchymal586
transition generates cells with properties of stem cells. Cell. 2008;133(4):704–715.587
4. Huang S. Non-genetic heterogeneity of cells in development: more than just noise. Development.588
2009;136(23):3853–3862.589
5. Ben-Porath I, Thomson MW, Carey VJ, Ge R, Bell GW, Regev A, et al. An embryonic stem cell–590
like gene expression signature in poorly differentiated aggressive human tumors. Nature genetics.591
2008;40(5):499–507.592
6. Kelloff GJ, Sigman CC. Assessing intraepithelial neoplasia and drug safety in cancer-preventive593
drug development. Nature Reviews Cancer. 2007;7(7):508–518.594
7. Virchow RLK. Cellular pathology. John Churchill; 1860.595
8. Huang S, Ernberg I, Kauffman S. Cancer attractors: a systems view of tumors from a gene network596
dynamics and developmental perspective. In: Seminars in cell & developmental biology. vol. 20.597
Elsevier; 2009. p. 869–876.598
9. Mendoza L, Alvarez-Buylla ER. Dynamics of the genetic regulatory network for Arabidopsis599
thaliana flower morphogenesis. J Theor Biol. 1998;193(2):307–319.600
15
10. Espinosa-Soto C, Padilla-Longoria P, Alvarez-Buylla ER. A gene regulatory network model for601
cell-fate determination during Arabidopsis thaliana flower development that is robust and recovers602
experimental gene expression profiles. The Plant Cell Online. 2004;16(11):2923–2939.603
11. Huang S, Kauffman S. Complex gene regulatory networks-from structure to biological observables:604
cell fate determination. Encyclopedia of Complexity and Systems Science Meyers RA, editors605
Springer. 2009;p. 1180–1293.606
12. Alvarez-Buylla ER, Azpeitia E, Barrio R, Beńıtez M, Padilla-Longoria P. From ABC genes to607
regulatory networks, epigenetic landscapes and flower morphogenesis: making biological sense of608
theoretical approaches. Seminars in cell & developmental biology. 2010;21(1):108–117.609
13. Huang S. Reprogramming cell fates: reconciling rarity with robustness. Bioessays. 2009;31(5):546–610
560.611
14. Kaneko K. Characterization of stem cells and cancer cells on the basis of gene expression profile612
stability, plasticity, and robustness. Bioessays. 2011;33(6):403–413.613
15. Huang S. The molecular and mathematical basis of Waddington’s epigenetic landscape: A frame-614
work for post-Darwinian biology? Bioessays. 2012;34(2):149–157.615
16. Davila-Velderrain J, Alvarez-Buylla ER. Bridging the Genotype and the Phenotype: Towards An616
Epigenetic Landscape Approach to Evolutionary Systems Biology. bioRxiv. 2014;.617
17. Xu J, Lamouille S, Derynck R. TGF-β-induced epithelial to mesenchymal transition. Cell research.618
2009;19(2):156–172.619
18. Battula VL, Evans KW, Hollier BG, Shi Y, Marini FC, Ayyanan A, et al. Epithelial-Mesenchymal620
Transition-Derived Cells Exhibit Multilineage Differentiation Potential Similar to Mesenchymal621
Stem Cells. Stem Cells. 2010;28(8):1435–1445.622
19. Li CW, Xia W, Huo L, Lim SO, Wu Y, Hsu JL, et al. Epithelial–mesenchymal transition induced623
by TNF-α requires NF-κB–mediated transcriptional upregulation of Twist1. Cancer research.624
2012;72(5):1290–1300.625
20. Morel AP, Lièvre M, Thomas C, Hinkal G, Ansieau S, Puisieux A. Generation of breast cancer626
stem cells through epithelial-mesenchymal transition. PloS one. 2008;3(8):e2888.627
21. Neph S, Stergachis AB, Reynolds A, Sandstrom R, Borenstein E, Stamatoyannopoulos JA. Cir-628
cuitry and dynamics of human transcription factor regulatory networks. Cell. 2012;150(6):1274–629
1286.630
22. Azpeitia E, Beńıtez M, Vega I, Villarreal C, Alvarez-Buylla ER. Single-cell and coupled GRN631
models of cell patterning in the Arabidopsis thaliana root stem cell niche. BMC systems biology.632
2010;4(1):134.633
23. Azpeitia E, Davila-Velderrain J, Villarreal C, Alvarez-Buylla ER. Gene regulatory network models634
for floral organ determination. In: Flower Development. Springer; 2014. p. 441–469.635
24. Álvarez-Buylla ER, Chaos Á, Aldana M, Beńıtez M, Cortes-Poza Y, Espinosa-Soto C, et al. Flo-636
ral morphogenesis: stochastic explorations of a gene network epigenetic landscape. Plos one.637
2008;3(11):e3626.638
25. Davila-Velderrain J, Mart́ınez-Garćıa J, Alvarez-Buylla ER. Modeling the Epigenetic Attractors639
Landscape: Towards a Post-Genomic Mechanistic Understanding of Development. Name: Frontiers640
in Genetics. 2015;6:160.641
16
26. Zhou J, Ng A, Tymms MJ, Jermiin LS, Seth AK, Thomas RS, et al. A novel transcription642
factor, ELF5, belongs to the ELF subfamily of ETS genes and maps to human chromosome643
11p13-15, a region subject to LOH and rearrangement in human carcinoma cell lines. Oncogene.644
1998;17(21):2719–2732.645
27. Ma XJ, Salunga R, Tuggle JT, Gaudet J, Enright E, McQuary P, et al. Gene expression pro-646
files of human breast cancer progression. Proceedings of the National Academy of Sciences.647
2003;100(10):5974–5979.648
28. Escamilla-Hernandez R, Chakrabarti R, Romano RA, Smalley K, Zhu Q, Lai W, et al. Genome-649
wide search identifies Ccnd2 as a direct transcriptional target of Elf5 in mouse mammary gland.650
BMC molecular biology. 2010;11(1):68.651
29. Chakrabarti R, Hwang J, Blanco MA, Wei Y, Lukačǐsin M, Romano RA, et al. Elf5 inhibits the652
epithelial–mesenchymal transition in mammary gland development and breast cancer metastasis653
by transcriptionally repressing Snail2. Nature cell biology. 2012;14(11):1212–1222.654
30. Vernier M, Bourdeau V, Gaumont-Leclerc MF, Moiseeva O, Bégin V, Saad F, et al. Regulation of655
E2Fs and senescence by PML nuclear bodies. Genes & development. 2011;25(1):41–50.656
31. McConnell BB, Gregory FJ, Stott FJ, Hara E, Peters G. Induced expression of p16 INK4a in-657
hibits both CDK4-and CDK2-associated kinase activity by reassortment of cyclin-CDK-inhibitor658
complexes. Molecular and cellular biology. 1999;19(3):1981–1989.659
32. Villacañas Ó, Pérez JJ, Rubio-Mart́ınez J. Structural analysis of the inhibition of Cdk4 and Cdk6660
by p16INK4a through molecular dynamics simulations. Journal of Biomolecular Structure and661
Dynamics. 2002;20(3):347–358.662
33. Bracken AP, Kleine-Kohlbrecher D, Dietrich N, Pasini D, Gargiulo G, Beekman C, et al. The663
Polycomb group proteins bind throughout the INK4A-ARF locus and are disassociated in senescent664
cells. Genes & development. 2007;21(5):525–530.665
34. Fang L, Igarashi M, Leung J, Sugrue MM, Lee SW, Aaronson SA. p21Waf1/Cip1/Sdi1 induces666
permanent growth arrest with markers of replicative senescence in human tumor cells lacking667
functional p53. Oncogene. 1999;18(18):2789–2797.668
35. Mao Z, Ke Z, Gorbunova V, Seluanov A. Replicatively senescent cells are arrested in G1 and G2669
phases. Aging (Albany NY). 2012;4(6):431.670
36. Chellappan SP, Hiebert S, Mudryj M, Horowitz JM, Nevins JR. The E2F transcription factor is a671
cellular target for the RB protein. Cell. 1991;65(6):1053–1061.672
37. Byeon IJL, Li J, Ericson K, Selby TL, Tevelev A, Kim HJ, et al. Tumor Suppressor p16INK4A:673
Determination of Solution Structure and Analyses of Its Interaction with Cyclin-Dependent Kinase674
4. Molecular cell. 1998;1(3):421–431.675
38. Beauséjour CM, Krtolica A, Galimi F, Narita M, Lowe SW, Yaswen P, et al. Reversal of human676
cellular senescence: roles of the p53 and p16 pathways. The EMBO journal. 2003;22(16):4212–4222.677
39. Freudlsperger C, Bian Y, Wise SC, Burnett J, Coupar J, Yang X, et al. TGF-β and NF-κB signal678
pathway cross-talk is mediated through TAK1 and SMAD7 in a subset of head and neck cancers.679
Oncogene. 2012;32(12):1549–1559.680
40. Harley C, Futcher A, Greider C. Telomeres shorten during ageing of human fibroblasts. Nature.681
1990;345(6274):458–460.682
17
41. Mani SA, Yang J, Brooks M, Schwaninger G, Zhou A, Miura N, et al. Mesenchyme Forkhead 1683
(FOXC2) plays a key role in metastasis and is associated with aggressive basal-like breast cancers.684
Proceedings of the National Academy of Sciences. 2007;104(24):10069–10074.685
42. Zeisberg M, Neilson EG, et al. Biomarkers for epithelial-mesenchymal transitions. The Journal of686
clinical investigation. 2009;119(6):1429–1437.687
43. Bolós V, Peinado H, Pérez-Moreno MA, Fraga MF, Esteller M, Cano A. The transcription factor688
Slug represses E-cadherin expression and induces epithelial to mesenchymal transitions: a compar-689
ison with Snail and E47 repressors. Journal of cell science. 2003;116(3):499–511.690
44. Dave N, Guaita-Esteruelas S, Gutarra S, Frias À, Beltran M, Peiró S, et al. Functional cooperation691
between Snail1 and twist in the regulation of ZEB1 expression during epithelial to mesenchymal692
transition. Journal of Biological Chemistry. 2011;286(14):12024–12032.693
45. Casas E, Kim J, Bendesky A, Ohno-Machado L, Wolfe CJ, Yang J. Snail2 is an essential mediator of694
Twist1-induced epithelial mesenchymal transition and metastasis. Cancer research. 2011;71(1):245–695
254.696
46. Hajra KM, Chen DY, Fearon ER. The SLUG zinc-finger protein represses E-cadherin in breast697
cancer. Cancer research. 2002;62(6):1613–1618.698
47. Davila-Velderrain J, Martinez-Garcia J, Alvarez-Buylla E. Descriptive vs. Mechanistic Network699
Models in Plant Development in the Post-Genomic Era. Plant Functional Genomics: Methods and700
Protocols. 2015;p. 455–479.701
48. Siegel R, Naishadham D, Jemal A. Cancer statistics, 2013. CA: a cancer journal for clinicians.702
2013;63(1):11–30.703
49. Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Global cancer statistics. CA: a cancer704
journal for clinicians. 2011;61(2):69–90.705
50. Stratton MR, Campbell PJ, Futreal PA. The cancer genome. Nature. 2009;458(7239):719–724.706
51. Hudson TJ, Anderson W, Aretz A, Barker AD, Bell C, Bernabé RR, et al. International network707
of cancer genome projects. Nature. 2010;464(7291):993–998.708
52. Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, Ellrott K, et al. The cancer709
genome atlas pan-cancer analysis project. Nature genetics. 2013;45(10):1113–1120.710
53. Yaffe MB. The scientific drunk and the lamppost: massive sequencing efforts in cancer discovery711
and treatment. Science signaling. 2013;6(269):pe13.712
54. Creixell P, Schoof EM, Erler JT, Linding R. Navigating cancer network attractors for tumor-specific713
therapy. Nature biotechnology. 2012;30(9):842–848.714
55. DePinho RA, Polyak K. Cancer chromosomes in crisis. Nature genetics. 2004;36(9):932–934.715
56. Fujikawa M, Katagiri T, Tugores A, Nakamura Y, Ishikawa F. ESE-3, an Ets family transcription716
factor, is up-regulated in cellular senescence. Cancer science. 2007;98(9):1468–1475.717
57. Lee JM, Dedhar S, Kalluri R, Thompson EW. The epithelial–mesenchymal transition: new insights718
in signaling, development, and disease. The Journal of cell biology. 2006;172(7):973–981.719
18
58. Cano A, Pérez-Moreno MA, Rodrigo I, Locascio A, Blanco MJ, del Barrio MG, et al. The transcrip-720
tion factor snail controls epithelial–mesenchymal transitions by repressing E-cadherin expression.721
Nature cell biology. 2000;2(2):76–83.722
59. Sun Y, Song GD, Sun N, Chen JQ, Yang SS. Slug overexpression induces stemness and promotes723
hepatocellular carcinoma cell invasion and metastasis. Oncology Letters. 2014;7(6):1936–1940.724
60. Liu Y, El-Naggar S, Darling DS, Higashi Y, Dean DC. Zeb1 links epithelial-mesenchymal transition725
and cellular senescence. Development. 2008;135(3):579–588.726
61. Weinberg RA. Twisted epithelial–mesenchymal transition blocks senescence. Nature cell biology.727
2008;10(9):1021–1023.728
62. Kim WY, Sharpless NE. The Regulation of¡ i¿ INK4¡/i¿/¡ i¿ ARF¡/i¿ in Cancer and Aging. Cell.729
2006;127(2):265–275.730
63. Müssel C, Hopfensitz M, Kestler HA. BoolNet—an R package for generation, reconstruction and731
analysis of Boolean networks. Bioinformatics. 2010;26(10):1378–1380.732
64. Zhou JX, Samal A, d’Hèrouël AF, Price ND, Huang S. Relative Stability of Network States in733
Boolean Network Models of Gene Regulation in Development. arXiv preprint arXiv:14076117.734
2014;.735
65. Wang G, Zhu X, Gu J, Ao P. Quantitative implementation of the endogenous molecular–cellular736
network hypothesis in hepatocellular carcinoma. Interface focus. 2014;4(3):20130064.737
66. Zhu X, Yuan R, Hood L, Ao P. Endogenous molecular-cellular hierarchical modeling of prostate738
carcinogenesis uncovers robust structure. Progress in biophysics and molecular biology. 2015;.739
67. Li C, Wang J. Quantifying cell fate decisions for differentiation and reprogramming of a human stem740
cell network: landscape and biological paths. PLoS computational biology. 2013;9(8):e1003165.741
68. Zhou JX, Brusch L, Huang S. Predicting pancreas cell fate decisions and reprogramming with a742
hierarchical multi-attractor model. PloS one. 2011;6(3):e14752.743
69. Zhou JX, Qiu X, d’Herouel AF, Huang S. Discrete Gene Network Models for Understanding Multi-744
cellularity and Cell Reprogramming: From Network Structure to Attractor Landscapes Landscape.745
In: Computational Systems Biology Second Edition Elsevier. 2014;p. 241–276.746
70. Lander AD. The edges of understanding. BMC biology. 2010;8(1):40.747
71. Chen Z, Trotman LC, Shaffer D, Lin HK, Dotan ZA, Niki M, et al. Crucial role of p53-dependent748
cellular senescence in suppression of Pten-deficient tumorigenesis. Nature. 2005;436(7051):725–730.749
72. Campisi J. Cellular senescence: putting the paradoxes in perspective. Current opinion in genetics750
& development. 2011;21(1):107–112.751
73. Narita M, Lowe SW. Senescence comes of age. Nature medicine. 2005;11(9):920–922.752
74. Yildiz G, Arslan-Ergul A, Bagislar S, Konu O, Yuzugullu H, Gursoy-Yuzugullu O, et al. Genome-753
wide transcriptional reorganization associated with senescence-to-immortality switch during human754
hepatocellular carcinogenesis. PloS one. 2013;8(5):e64016.755
75. Smit MA, Peeper DS. Epithelial-mesenchymal transition and senescence: two cancer-related pro-756
cesses are crossing paths. Aging (Albany NY). 2010;2(10):735.757
19
76. Glaab E, Baudot A, Krasnogor N, Schneider R, Valencia A. EnrichNet: network-based gene set758
enrichment analysis. Bioinformatics. 2012;28(18):i451–i457.759
77. Glaab E, Baudot A, Krasnogor N, Valencia A. TopoGSA: network topological gene set analysis.760
Bioinformatics. 2010;26(9):1271–1272.761
78. Winterhalter C, Widera P, Krasnogor N. JEPETTO: a Cytoscape plugin for gene set enrichment762
and topological analysis based on interaction networks. Bioinformatics. 2014;30(7):1029–1030.763
79. Saadatpour A, Albert R, Reluga TC. A reduction method for Boolean network models proven to764
conserve attractors. SIAM Journal on Applied Dynamical Systems. 2013;12(4):1997–2011.765
80. Naldi A, Remy E, Thieffry D, Chaouiya C. Dynamically consistent reduction of logical regulatory766
graphs. Theoretical Computer Science. 2011;412(21):2207–2218.767
81. Allen LJ. An introduction to stochastic processes with applications to biology. CRC Press; 2010.768
82. Li J, Poi MJ, Tsai MD. Regulatory mechanisms of tumor suppressor P16INK4A and their relevance769
to cancer. Biochemistry. 2011;50(25):5566–5582.770
83. Yamakoshi K, Takahashi A, Hirota F, Nakayama R, Ishimaru N, Kubo Y, et al. Real-time in vivo771
imaging of p16Ink4a reveals cross talk with p53. The Journal of cell biology. 2009;186(3):393–407.772
Figure Legends773
Figure 1. Gene regulatory network for epithelial carcinogenesis. Nodes represent genes, and774
arrows (bars) represent experimentally characterized activation (arrow-heads) or repression (flat-heads)775
interactions. Genes corresponding to TFs are represented by squares and the rest by circles. (a) Colors776
indicate association with specific phenotypes and processes: epithelial (green), mesenchymal (orange),777
inflammation (red), senescence and DNA damage (blue), cell–cycle (purple), and polycomb complex778
(yellow). (b) Core gene regulatory module in the context of the global network. Colored nodes represent779
the final set of molecules obtained after the network reduction methodology was applied (see Methods)780
and which were included in the core GRN model.781
Figure 2. Core gene regulatory network module for epithelial carcinogenesis Nodes represent782
either single or subsets of genes (see Results); arrows-heads represent activations and flat–heads repression783
interactions. Five of the nodes are involved in the specification of the cellular phenotypes: Epithelial (Ese–784
2), Senescent (p16, p53), and Mesenchymal stem–like (Snai2, TELasa). Three nodes are tightly associated785
with cell–cycle regulation (Rb, E2F, Cyclin), while node NF–κB represents cellular inflammation.786
Figure 3. The core gene regulatory module in the context of the Hallmarks of Cancer approach.787
The antagonistic activity state ESE–2 (-) and Snai2 (+) enable cells to sustain proliferative signals and788
evade growth suppressors by undergoing a dedifferentiation process. The state p16(-), Rb(-), p53(-), and789
TELasa (+) enable cell to acquire replicative immortality, resist cell death, as well as present genome790
instability and a mutation–prone phenotype by surpassing cellular senescence. High levels of cytokines791
and NF–κB(+) expose cells to tumor promoting inflammation. The constitutive activity of Snai2(+)792
epitomizes the intrinsic phenotypic features of the cells emerging from the process of inflammation–793
induced EMT: activating invasion, avoiding immune destruction, and deregulating cellular energetics.794
20
Figure 4. Temporal sequence and global order of cell–fate attainment pattern under the795
stochastic Boolean GRN model during epithelial carcinogenesis. (a) Maximum probability796
p of attaining each attractor, as a function of time (in iteration steps). Vertical lines mark the time797
when maximal probability of each attractor occurs. The most probable sequence of cell attainment is:798
epithelial(E)→ senescent(S)→ mesenchymal(cancer–like)(M). The value of the error probability used in799
this case was ξ = 0.05. The same patterns were obtained with the 3 different error probabilities tested800
(data not shown). (b) Schematic representation of the possible transitions between pairs of attractors.801
Arrows indicate the directionality of the transitions. Above each arrow a sign (+) or (−) indicates802
whether the calculated net transition rate between the corresponding attractors is positive or negative.803
Red arrows represent the globally consistent ordering for the 3 attractors: the order of the attractors in804
which all individual transition has a positive net rate, resulting in a global probability flow across the805
EL.806
Supporting Information captions807
Text S1. Supplementary text including detailed methodology and definitions.808
Text S2. Supplementary text including calculated transition probability, MFPT, and net809
transition rate matrices.810
Figure S1. Network topological gene set analysis results.811
Figure S2. Recovered attractors in mutant conditions.812
Tables813
21
KEGG – Pathway or Process XD–score q-value Overlap/Size
Bladder cancer 1.1447 0 12/38
Chronic myeloid leukemia 0.86866 0 17/69
p53 signaling pathway 0.78477 0 14/62
Pancreatic cancer 0.68155 0 14/70
Glioma 0.68155 0 12/60
Non–small cell lung cancer 0.66586 0 10/51
Melanoma 0.65574 0 12/62
Small cell lung cancer 0.56447 0 14/82
Prostate cancer 0.54821 0 14/84
Cell cycle 0.54821 0 20/120
Cytosolic DNA–sensing pathway 0.48155 0.00001 6/40
Thyroid cancer 0.36155 0.00784 3/25
NOD-like receptor signaling pathway 0.35612 0.00001 7/59
GO Biological Process XD-score q–value Overlap/Size
replicative senescence 3.13328 0 8/10
cellular senescence 0.73328 0.02244 2/10
cell aging 0.43328 0.00608 3/24
activation of NF–κB–inducing kinase activity 0.43328 0.04656 2/16
determination of adult lifespan 0.33328 0.40382 1/10
epithelial cell differentiation 0.32721 0.13188 2/33
positive regulation of NF–κB transcription factor activity 0.30109 0 8/87
Table 1. Significant pathways and processes according to network–based gene set
enrichment analysis
Cellular Phenotype Recovered Attractor (Active) “Expected Attractors” References
Epithelial Ese–2, NF-κB, E2F, Cyclin Ese–2, NF–κB, Cell Cycle(+) [29]
Senescent p16, p53, Ese–2, NF–κB, Rb p16, p53, NF–κB, Cell Cycle(-) [42, 82,83]
Mesenchymal stem-like Snai2, Telomerase, NF–κB, Cyclin Snai2, Telomerase, NF–κB, Cell Cycle(+) [29,42]
Table 2. Predicted and Observed Attractors
EED
p16
SUZ12
C-MYC
BMI1
E2F
EGF
Cyclin
ESE-1 HER2
ESE-2
ESE-3
CDK4
p21
p14
PML
p53
EZH2 CDK6
MDM2
TNF-a
Chk2
ATM
Chk1
IL-6
TAK1
TELun
TELasa
ATR
Snai2
Rb
CBX7
CDK2
Snail
Twist
RKIP
IkB
FOXC2
NF-kB
TGF-b
Zeb
a)
Twist
CBX7
ESE-3
Snai2
CDK2
Rb
Snail
ESE-2
E2F
Cyclin
p16
HER2
SUZ12
EED
ESE-1
EGF
C-MYC
BMI1
ATR
Chk2
Zeb
p53
ATM
PML
Chk1
TELun
RKIP
TELasa
IkBNF-kB
TNF-a
IL-6
TAK1
FOXC2
TGF-bEZH2 CDK6
MDM2
p14
p21
CDK4
b)
TELasa
Rb
p16
ESE-2
E2F
Cyclin
Snai2
p53
NF-kB
NFkB(+)
Promoting inflammation 
Enabling replicative 
immortality
p16(-) TELase(+)Genome instability 
and mutation 
Sustaining proliferative 
signaling 
Rb(-)
Cyclin(+)p53(-)Resisting cell death
Deregulating cellular energetics 
Immune system evasion
Activating invasion
 and metastasis
Evading growth suppressors Snai2(+)ESE-2(-)
Epithelial
+ +
+
-
--
Senescent
Mesenchymal
"Cancer"
E
S
M
M
S
+ +
+ -
S
E
M
M
E
- +
+ -
M
E
S
S
E
-
-
+
-
b)
a)
E S M
1
Methods for Characterizing the Epigenetic
Attractors Landscape Associated with
Boolean Gene Regulatory Networks
Davila-Velderrain J 1,2,∗, Juarez-Ramiro L3, Martinez-Garcia JC 3, and
Alvarez-Buylla ER 1,2,∗
1Instituto de Ecologı́a, Universidad Nacional Autónoma de México, Cd.
Universitaria, México, D.F. 04510, México
2 Centro de Ciencias de la Complejidad (C3), Universidad Nacional Autónoma de
México, Cd. Universitaria, México, D.F. 04510, México
3 Departamento de Control Automático, Instituto Politécnico Nacional, A. P.
14-740, 07300 México, DF, México
Correspondence*:
Elena Alvarez-Buylla
Instituto de Ecologı́a, Universidad Nacional Autónoma de México, Cd.
Universitaria, México, D.F. 04510, México, eabuylla@gmail.com
Jose Davila-Velderrain
Instituto de Ecologı́a, Universidad Nacional Autónoma de México, Cd.
Universitaria, México, D.F. 04510, México, jdjosedavila@gmail.com
ABSTRACT2
Gene regulatory network (GRN) modeling is a well established theoretical framework for the3
study of cell-fate specification during developmental processes. Recently, dynamical models4
of GRNs have been taken as a basis for formalizing the metaphorical model of Waddingtons5
epigenetic landscape, providing a natural extension for the general protocol of GRN modeling.6
In this contribution we present in a coherent framework a novel implementation of two previously7
proposed general frameworks for modeling the Epigenetic Attractors Landscape associated with8
boolean GRNs: the inter-attractor and inter-state transition approaches. We implement novel9
algorithms for estimating inter-attractor transition probabilities without necessarily depending on10
intensive single-event simulations. We analyze the performance and sensibility to parameter11
choices of the algorithms for estimating inter-attractor transition probabilities using three real12
GRN models. Additionally, we present for the first time, a side-by-side analysis of the two13
frameworks and show how the methods complement each other using a real case study: a14
cellular-level GRN model for epithelial carcinogenesis. We expect the toolkit and comparative15
analyzes put forward here to be a valuable additional resource for the systems biology16
community interested in modeling cellular differentiation and reprogramming both in normal and17
pathological developmental processes.18
19
Keywords: Gene regulatory network, Epigenetic landscape, system dynamics, stochastic model, attractors, cell-fate decision,20
development21
1
Davila-Velderrain et al. Characterizing the Epigenetic Attractors Landscape
1 INTRODUCTION
The postulation of experimentally grounded gene regulatory network (GRN) dynamical models, their22
qualitative analysis and dynamical characterization in terms of control parameters, and the validation of23
GRN predictions against experimental observations has become a well established framework in systems24
biology – see, for example: Mendoza and Alvarez-Buylla (1998); Espinosa-Soto et al. (2004); Huang25
et al. (2007); Davila-Velderrain et al. (2015a). There are multiple tools available for the straightforward26
implementation and analysis of dynamical models of GRNs Azpeitia et al. (2014). These models are well-27
suited for the study of cell-fate specification during developmental processes. More recently, dynamical28
models of GRNs have been taken as a basis for formalizing a century-old developmental metaphor:29
Waddington’s epigenetic landscape Waddington (1957); Alvarez-Buylla et al. (2008); Huang (2012);30
Villarreal et al. (2012); Davila-Velderrain et al. (2015c). The present authors recently introduced the31
term Epigenetic Attractors Landscape (EAL) in order to distinguish this modern view of the EL from its32
metaphorical counterpart (see Davila-Velderrain et al. (2015b)). Accordingly, here we will refer as EAL33
to a group of dynamical models grounded in dynamical systems theory and which operationally define an34
underlying EL associated with GRN dynamics. In this contribution we focus on the EAL associated with35
boolean GRNs.36
Despite growing interest in modeling the EAL, as evidenced by recent model proposals in the study37
of stem cell differentiation Li and Wang (2013) and reprogramming Wang et al. (2014b), as well as the38
study of carcinogenesis Wang et al. (2014a); Zhu et al. (2015) and cancer therapeutics Choi et al. (2012);39
Wang (2013); unlike the case of GRNs, there are no available tools for the straightforward implementation40
of EALs. Furthermore, different EAL models have not been compared directly through side-by-side41
analysis of the same biological system. This has arguably precluded the wide-spread applicability of42
EALs.43
One of the first methodological frameworks proposed to explore the EAL associated with a Boolean44
GRN was presented by Alvarez-Buylla and collaborators Alvarez-Buylla et al. (2008). Briefly, in its45
original form this framework rests on three steps: (1) introducing stochasticity into the boolean dynamics46
by means of the so-called stochasticity in nodes model (SIN), (2) estimating an inter-attractor transition47
probability matrix by simulation, and (3) analyzing the temporal evolution of the probability distribution48
over attractor states (see methods). For the purpose of this contribution, we refer to such framework as49
the inter-attractor transition approach (IAT). Recently, a related framework was presented by Zhou and50
collaborators Zhou et al. (2014a). The main differences between this and the former method are: the51
latter (1) precludes simulation by introducing stochasticity directly into a deterministic transition matrix,52
and (2) it is based on the estimation of a inter-state transition probability matrix. We refer to this latter53
framework as the inter-state transition approach (IST). Additionally, Zhou and collaborator introduced54
the idea of a global ordering of attractors in the EAL defined by analyzing the relative stability of attractor55
states Zhou et al. (2014b).56
In this contribution we present in a coherent framework a novel implementation of the two57
methodologies, as well as associated analysis tools such as the global ordering of the attractors58
based on relative stabilities, the computation of a quasi-potential landscape based on an stationary59
probability distribution, and additional tools for downstream analyzes and plotting. We use the popular R60
statistical programming environment (www.R-project.org). For the first framework (IAT), we implement61
novel algorithms for estimating inter-attractor transition probabilities without necessarily depending on62
intensive single-event simulations. For both frameworks (IAT and IST) we exploit the vector-based63
programming capability of the R language. We analyze the performance and sensibility to parameter64
choices of the algorithms for estimating inter-attractor transition probabilities using three GRN models:65
the Arabidopsis (1) root stem cell niche and (2) early flower development GRNs; and a cellular-level66
GRN model for epithelial carcinogenesis. Additionally, for the latter model we present for the first67
time, a side-by-side analysis of the two frameworks and show how the methods complement each other.68
Importantly, we show that the attractor time-ordered transitions obtained by directly estimating an inter-69
attractor transition matrix are consistent with the global ordering of the attractors obtained by means of an70
This is a provisional file, not the final typeset article 2
Davila-Velderrain et al. Characterizing the Epigenetic Attractors Landscape
their corresponding relative stabilities. All the necessary codes for applying the methods showed herein71
are made publicly available; we expect this toolkit to be a valuable additional resource for the systems72
biology community.73
2 RESULTS
2.1 CHARACTERIZING THE EPIGENETIC ATTRACTORS LANDSCAPE
In this work we organize previously existing, yet dispersed, mathematical analyzes into a coherent74
framework for the characterization of EAL associated with Boolean GRNs. Figure 1 schematically75
represents a general work flow for such characterization. The work flow is supposed to be applicable76
to an already available and validated experimentally grounded Boolean GRN model (see Azpeitia et al.77
(2014)). The first necessary step (Fig. 1a) consist on characterizing the state-space associated with the78
GRN in terms of the attained attractors and their basins, a standard practice in the dynamical analysis of79
Boolean GRNs (see methods). The second main step consists on estimating either a inter-attractor or inter-80
state transition probability matrix (or both) (Fig. 1b). The former is the main mathematical structure for the81
IAT aproach, and the latter for the IST approach (see methods). Downstream analyzes of the underlying82
EAL such as the temporal-order of attractor attainment, the attractor relative stability and global ordering,83
and the construction of a probabilistic landscape are based on the transition matrices and can be applied84
afterwards (Fig. 1c).85
2.2 INTER-ATTRACTOR TRANSITIONS
A first necessary step in order to explore the EAL associated with a Boolean GRN using the IAT approach86
is to calculate the probabilities of transition from one attractor to another. In this contribution we present87
two algorithms for such task (see methods). Algorithm 1 implements what we will refer to as an intuitive88
mapping-guided random walk in state space. The reasoning is as follows. An initial state is taken at89
random, which is then mapped to a next state using the stochastic mapping in Equation (3). The basins90
corresponding to the two states are recorded in order. Subsequently, another state is picked at random91
from the latter basin and the mapping procedure is repeated. The procedure is repeated Nsteps number92
of times, each time taking at random a state from the present basin, and the goal is to record a stochastic93
realization of the transitions from one basin to another. Algorithm 2, on the other hand, considers all the94
possible states, repeats them Nreps number of times in a single data structure, and maps them using95
Equation (3) as well (for details, see methods). An important technical issue is then how to select the96
parameters Nsteps and Nreps, respectively. Specially, because this type of simulation approaches have97
been qualified as requiring large number of time-consuming sampling Zhou et al. (2014a).98
For each algorithm we tested how the estimate of the inter-attractor transition matrix changes as the99
parameter value increases. We used three real GRN models for testing: Arabidopsis single-cell root stem100
cell niche GRN (root-GRN) Azpeitia et al. (2010), Arabidopsis floral organ determination GRN (flower-101
GRN) Azpeitia et al. (2014), and a cellular-level GRN model for epithelial carcinogenesis (cancer-GRN).102
We found that for models of size common to GRN developmental modules (i.e., 8 − 15 genes) the103
estimation obtained with small values of the parameter rapidly converges to that obtained by using large104
values (e.g., ≈ 106). Figure 2 shows how the distance between the estimate obtained using a value105
Nsteps(Nreps) = i and that obtained using Nsteps = 106 and Nreps = 103 for Algorithms 1 and106
2, respectively. These results correspond to the three GRN models: root (Fig. 2a-b), cancer (Fig. 2c-d),107
and flower (Fig. 2e-f). Additionally, we show that the estimate obtained with one of the algorithms also108
rapidly converges to that obtained with the other algorithm. Figure 3 shows how the distance between the109
estimate obtained using one algorithm with a parameter value i and that obtained using the other algorithm110
with a large parameter value decreases as i increases. Based on this latter analysis we conclude that, for111
GRNs of sizes 8−15 genes, using a value of the order of Nsteps = 104 for algorithm 1 and Nreps = 102112
Frontiers 3
Davila-Velderrain et al. Characterizing the Epigenetic Attractors Landscape
would be sufficient to achieve an accuracy similar to that achieved using large values (i.e, 106 and 103,113
respectively).114
2.3 CHARACTERIZING THE EAL
In this section we provide as an example the analysis of the EAL underlying a cellular-level GRN115
model for epithelial carcinogenesis. The details of the construction and validation of such network116
model are being published by the authors elsewhere. The GRN comprises 9 main regulators of epithelial117
carcinogenesis (Fig. 4), and its dynamical characterization uncovers 3 fixed-point attractor corresponding118
to the epithelial, senescent, and messenchymal stem-like cellular phenotypes. We applied the two119
approaches (IAT and IST) to the cancer-GRN, and for the IAT approach we applied the two algorithms120
proposed herein. Accordignly, we estimated two inter-attractor transition matrices and one inter-state121
transition matrix. For simplicity in all cases we kept fixed a single value for the error parameter ξ = 0.05.122
Using the estimated matrices, we applied the downstream analyzes depicted in Figure 1c. Figure 5 shows123
two graphs plotting the temporal evolution of the occupation probability distribution over attractor states124
epithelial (black), senescent (red) and messenchymal (green) – conditioned on an initial distribution where125
all the cellular population is in the epithelial attractor state. The uncovered attractor time-order is indicated126
by sequential vertical lines: the order is epithelial → senescent → messenchymal. Importantly, the two127
algorithms give the same qualitative result.128
Subsequently, we uncovered the global ordering of attractors by calculating the relative stabilities and129
net transition rates between pairs of attractors using the two inter-attractor transitions estimated with the130
two algorithms (for details, see methods). Figure 6 shows the plot of two graphs where an arrow appears131
in color red if the calculated transition rate between the attractor is positive in the indicated direction. The132
global ordering corresponds to the path comprised by directed arrows passing by the three attractors, here:133
epithelial→ senescent→ messenchymal. Thus, the global ordering is consistent with the attractor time-134
order, as long as the latter is conditioned on having the total probability mass in the epithelial attractor as135
initial state. Again, the two algorithms produce the same qualitative result.136
Finally, we used the estimated inter-state transition matrix obtained with the IST approach to derive137
a graphical probabilistic landscape (see methods). The landscape is based on the stationary probability138
distribution uss obtained by numerical simulation (see methods). Figure 7 and 8 show a 3D-surface and a139
contour plot respectively. The graphical landscape was derived by first mapping all the state vectors in the140
sate-space into a low dimensional space by the dimensionality reduction technique principal component141
analysis. The first two component are taken as the coordinates in the 3D plot, where the z-coordinate142
corresponds to the values −log(uss). The surface is inferred by interpolating the spaced data points using143
the technique of thin plate spline regression Furrer et al. (2009). The 3D-surface plot nicely shows the144
relative stability of the states by means of their probability, the lower states begin more stable. The route145
from the attractors of less stability to that with the highest consists with the global ordering uncovered146
above. However, in the case of the IST transition and the probabilistic landscape we have additional147
information concerning the relative stability of all the transitory states in state space.148
3 DISCUSSION
Boolean GRN models are well-established tools for the mechanistic study of the establishment of cellular149
phenotypes during developmental dynamics. Their simplicity and deterministic nature are well-suited150
for answering questions regarding the sufficiency of molecular players and interactions necessary to151
explain observed cellular phenotypes. In the present contribution we present methods to study an extended152
Boolean GRN model which take stochasticity into consideration, necessary for studying cell-state153
transition events.154
In the case of the stochastic Boolean GRNs, the model of interest involves random samples with a non-155
trivial dependence structure. In such cases, efficient simulation algorithms are needed in order to explore156
This is a provisional file, not the final typeset article 4
Davila-Velderrain et al. Characterizing the Epigenetic Attractors Landscape
and characterize the underlying structure and to understand the behavioral (dynamical) consequences157
of the constrains imposed by such structure. Accordingly, we propose two algorithms of general158
applicability, and show how these can be used to estimate transitions probabilities in an efficient way159
from moderate size GRNs similar to those proposed as developmental modules driving developmental160
processes. Although we show that the two algorithms generate consistent estimates, one or the order may161
be prefer depending on the GRN in question and the computational resources at hand. Algorithm 1 is162
likely to be preferred in the case of larger GRNs, as it is not constrained by the size of the GRN per se, but163
the number of steps chosen in the simulation. On the other hand, given the declarative representation used164
in algorithm 2, its performance is constrained by the available of memory. Algorithm 2, however, may be165
preferred for fast estimates in small to moderate size GRNs (¡15 genes). Importantly, although we tested166
the performance of the algorithms in terms of the number of steps chosen for the simulations, the results167
should not be generalized without caution given that we only used three real GRNs, and the results may168
vary either for larger GRNs or sate spaces with more complex structures.169
For illustrative purposes we applied all the methods and downstream analyzes presented herein to a170
specific GRN: a cellular-level GRN model for epithelial carcinogenesis. We show that for this case,171
the uncovered temporal-order of attractor attainment is consistent with the global ordering based on172
relative stability, both calculated from a inter-attractor transition probability matrix. The result of the173
former is conditioned on the initial occupation probability taken. An interesting open problem would be174
to generalize this relationship using GRNs with divers structures, for example to ask if the global ordering175
of attractors is robust enough as to drive most initial distributions into a consistent temporal ordering.176
An additional interesting questions would be, what does this relationship tells us about the structural177
constraints imposed by the GRN. The tools and implementation presented here may prove useful for such178
theoretical studies.179
Finally, we present tools for deriving a probabilistic landscape from an estimated inter-state transition180
matrix in terms of the stationary probability distribution over state space. This latter analysis and the181
associated graphical tools can be applied to systematically study how the system responds to perturbations182
resulting in a reshaped EAL. Structural alterations of the EAL may predict the induction of preferential183
cell-state transitions such as the case of reprogramming strategies Zhou and Huang (2011) or therapeutic184
interventions against the stabilization of a cancer attractor Huang and Kauffman (2013); Wang (2013).185
Overall, in this contribution we present in a coherent framework a novel implementation of general186
frameworks for modeling the Epigenetic Attractors Landscape associated with boolean GRNs. We provide187
analysis of the method performance and show how they can be applied to real case GRNs. we expect the188
toolkit and comparative analyzes put forward here to be a valuable additional resource for the systems189
biology community interested in modeling cellular differentiation and reprogramming both in normal and190
pathological developmental processes.191
4 MATERIAL & METHODS
BOOLEAN GENE REGULATORY NETWORKS
A Boolean network models a dynamical system assuming both discrete time and discrete state variables.192
This is expressed formally with the mapping:193
xi(t+ 1) = Fi(x1(t), x2(t), ..., xk(t)), (1)
where the set of functions Fi are logical prepositions (or truth tables) expressing the relationship between194
the genes that share regulatory interactions with the gene i, and where the state variables xi(t) can take195
the discrete values 1 or 0 indicating whether the gene i is expressed or not at a certain time t, respectively.196
Frontiers 5
Davila-Velderrain et al. Characterizing the Epigenetic Attractors Landscape
A completely specified Boolean GRN model is analyzed by either of two methods: (1) by exhaustive197
computational characterization of the state space in terms of attained attractors and their basins of198
attractions (used in IAT), or (2) by defining a matrix explicitly encoding the mapping in Equation (1)199
(used in IST). Specifically, for the latter method, following Zhou et al. (2014b) the mapping in Equation200
(1) is used to define a single-step 2n x 2n transition matrix T with elements ti,j , where:201
ti,j =
{
1, xj = F(xi)
0, Otherwise.
(2)
Here xi is the network state i from the state-space of size 2n corresponding to a network of n genes, and202
F represents the vector of n functions represented element-wise in Equation (1). Given the deterministic203
character of the mapping in Equation (1), the matrix T is sparse, each row i having only one element where204
ti,j = 1. The matrix T constitutes a declarative representation which includes the complete information205
of the mapping in Equation (1): the matrix T assign to each of the states xk, where k ∈ {1, ..., 2n}, its206
corresponding state in time t+ 1.207
INTER-ATTRACTOR TRANSITION APPROACH
Including Stochasticity208
Following Alvarez-Buylla et al. (2008); Azpeitia et al. (2014); Davila-Velderrain et al. (2015b), a209
Boolean GRN is extended into a discrete stochastic model by means of the so–called stochasticity in210
nodes (SIN) model. In this model, a constant probability of error ξ is introduced for the deterministic211
Boolean functions as follows:212
Pxi(t+1)[Fi(xregi(t))] = 1− ξ,
Pxi(t+1)[1− Fi(xregi(t))] = ξ.
(3)
It is assumed that the probability that the value of the random variable xi(t + 1) (a gene) is determined213
or not by its associated logical function Fi(xregi(t)) is 1−ξ or ξ, respectively. The probability ξ is a scalar214
constant parameter acting independently per gene. The vector xregi represents the regulators of gene i.215
Inter-Attractor Transition Probability Estimation216
An attractor transition probability matrix Π with components:217
πij = P (At+1 = j|At = i), (4)
representing the probability that an attractor j is reached from an attractor i is estimated by either of two218
simulation-based algorithms proposed herein (see results).219
In Algorithm 2, Bin(n = 1, ξ) refers to a binomial distribution given by Bin(k|n, ξ) =
(
n
k
)
ξk(1 −220
ξ)n−k. In the special case used here (with n = 1) the distribution corresponds to a Bernoulli221
distribution. Thus, what we call perturbation indicator vector effectively simulates tossing a biased coin222
Nsteps xn x 2n times. Each outcome x = 1 indicates the position where an error in the mapping has223
occurred, according to Equation (3).224
The elements πij of the matrix Π are obtained as maximum likelihood estimates based on the empirical225
transition probability resulting from the simulations from either algorithm 1 or 2.226
This is a provisional file, not the final typeset article 6
Davila-Velderrain et al. Characterizing the Epigenetic Attractors Landscape
Algorithm 1 Simulate inter-attractor stochastic realization
Initiate storage[Nsteps]
from state space = {1, ..., 2n} pick randomly initial state xi
storage[1]← basin k← map← xi
for (stepN in 2 to Nsteps) do
state xj ← stochastic mapping Eq(2)← state xi
storage[stepN]← basin k← map← xj
from sub space = {basin k} pick randomly state xi
end for
return storage
Algorithm 2 Implicit bit-flip simulation
Initiate storage j x j matrix Π, j ∈ {1, ..., nattractors}
Generate state space = {x1, ...,x2n}
Generate set Xt+1 = F(state space)
Xpert
t+1 ← repeat Xt+1 element-wise Nsteps times
Generate perturbation indicator vector piv:
piv← simulate Nsteps xn x 2n observations from Bin(n = 1, ξ)
for piv[i] = 1 do
Apply error in Xpert
t+1 [i] , i ∈ {1, ..., Nsteps xn x 2n}
end for
Xpert ← split Xpert
t+1 in n-size state vectors xk, k ∈ {1, ..., Nsteps x 2n}
for each xi in state space do
basin j ← map xi
end for
for each xk in Xpert do
basin j ← map xk
end for
update πj,j
return matrix
INTER-STATE TRANSITION PROBABILITY APPROACH
Including Stochasticity227
For the IST approach, stochasticity is introduced in a declaractive manner (i.e., by means of a single228
structure representation) using a binomial distribution Zhou et al. (2014a,b). Specifically, the effect of229
noise on each possible single-state transition is represented by introducing a noise matrix N with elements230
Ni,j =



(
n
dij
)
ξdij (1− ξ)n−dij , i 6= j
0, i = j
(5)
where dij is the Hamming distance between the states i and j (i.e., dij = ‖xi − xj‖H ). This231
representation formalizes an intuitive notion: the effect of noise on the system is more (less) likely to232
produce a state less (more) similar to the initial state.233
234
Inter-State Transition Probability Estimation235
Frontiers 7
Davila-Velderrain et al. Characterizing the Epigenetic Attractors Landscape
A single object including both stochastic perturbations and deterministic mapping is obtained by adding236
the noise matrix N and the deterministic single-step transition matrix T (see Equation 2) as follows237
Π = (1− ξ)nT+N (6)
After normalizing a transition probability matrix Π is obtained with components238
πij = P (xt+1 = j|xt = i). (7)
The components πij represent the probability that a state j is reached from a state i, where i, j ∈239
{1, ...2n}.240
TEMPORAL EVOLUTION OF STATES/ATTRACTORS PROBABILITY
In both approaches (IAT and IST) a sequence of random variables {Ct : t ∈ N} is considered a Markov241
chain (MC). In IAT (IST) CT takes as values the different attractors (states), the elements πi,j representing242
inter-attactor(states) transition probabilities, and the matrix Π the (one-step) transition probability matrix.243
As the probabilities do not depend on time, the MC is homogeneous.244
The occupation probability distribution P (Ct = j) – i.e., the probability that the chain is in state245
(attractor or state) j at a given time t – is denoted by the row vector u(t). The probabilities temporally246
evolve according to the dynamic equation247
u(t+ 1) = u(t)Π. (8)
Taking u(0) as the initial distribution of the MC, the equation reads u(1) = u(0)Π. By linking the248
occupation probabilities iteratively we get u(t) = u(0)Πt: the occupation probability distribution at time249
t can be obtained directly by matrix exponentiation.250
EAL ANALYZES
Temporal-order of Attractor Attainment251
Having obtained the temporal evolution of the occupation probability distribution u(t) given an initial252
distribution u(0) by numerically solving Equation (8), following Alvarez-Buylla et al. (2008), it is253
assumed that the most likely time for an attractor to be reached is when the probability of reaching254
that particular attractor is maximal. Therefore, the temporal sequence in which attractors are attained255
is obtained by determining the sequence in which their maximum probabilities are reached using u(t).256
Probabilistic Landscape257
A stationary probability distribution of a MC is a distribution uss which satisfies the steady state equation258
uss = ussΠ. The stationary probability distribution, if exists, is calculated either by solving the equation259
uss(I − Π) = 0, where I is the nxn identity matrix Wilkinson (2011); or by numerically solving260
Equation (8), as uss corresponds to the long-run distribution of the MC: uss = limt→∞ u(t) Bolstad261
(2011). A probabilistic landscape U – also called a quasi-potential – can be obtaining by mapping the262
distribution uss using −ln(uss). Such landscape reflects the probability of states and it provides a global263
characterization and a stability measure of the GRN system Wang (2015).264
This is a provisional file, not the final typeset article 8
Davila-Velderrain et al. Characterizing the Epigenetic Attractors Landscape
Attractor Relative Stability and Global Ordering Analyses265
A relative stability matrix M is calculated which reflects the transition barrier between any two states266
based on the mean first passage time (MFPT). The transition barrier in the EAL epitomizes the ease for267
transitioning from one attractor to another. The ease of transitions, in turn, offers a notion of relative268
stability. Zhou and collaborators recently proposed that a GRN has a consistent global ordering of all of269
the attractors which can be uncovered by considering their relative stabilities Zhou et al. (2014a,b). A net270
transition rate between attractor i and j is defined in terms of the MFPT as follows:271
di,j =
1
MFPTi,j
−
1
MFPTj,i
(9)
The consistent global ordering of the attractors is defined based on the formula proposed in Zhou et al.272
(2014b). Briefly, the consistent global ordering of the attractors is given by the attractor permutation in273
which al transitory net transition rates from an initial attractor to a final attractor are positive. The MFPTs274
are calculated either by implementing the matrix-based algorithm proposed in ) or by means of numerical275
simulation.276
IMPLEMENTATION
All the methods presented here were implemented using the R statistical programming environment277
(www.R-project.org). The code relies on the following packages: BoolNet, for the dynamical analysis278
oo Boolean neworks Müssel et al. (2010); expm, for matrix computations Goulet et al. (2013); igraph,279
for network analyses Csardi and Nepusz (2006); markovchain for MC analysis and inference; and fields,280
for surface plotting Furrer et al. (2009).281
DISCLOSURE/CONFLICT-OF-INTEREST STATEMENT
The authors declare that the research was conducted in the absence of any commercial or financial282
relationships that could be construed as a potential conflict of interest.283
REFERENCES
Alvarez-Buylla, E. R., Chaos, Á., Aldana, M., Benı́tez, M., Cortes-Poza, Y., Espinosa-Soto, C., et al.284
(2008), Floral morphogenesis: stochastic explorations of a gene network epigenetic landscape, Plos285
one, 3, 11, e3626286
Azpeitia, E., Benı́tez, M., Vega, I., Villarreal, C., and Alvarez-Buylla, E. R. (2010), Single-cell and287
coupled grn models of cell patterning in the arabidopsis thaliana root stem cell niche, BMC systems288
biology, 4, 1, 134289
Azpeitia, E., Davila-Velderrain, J., Villarreal, C., and Alvarez-Buylla, E. R. (2014), Gene regulatory290
network models for floral organ determination, in Flower Development: Methods and Protocols291
(Springer)292
Bolstad, W. M. (2011), Understanding computational Bayesian statistics, volume 644 (John Wiley &293
Sons)294
Choi, M., Shi, J., Jung, S. H., Chen, X., and Cho, K.-H. (2012), Attractor landscape analysis reveals295
feedback loops in the p53 network that control the cellular response to dna damage, Science signaling,296
5, 251, ra83–ra83297
Csardi, G. and Nepusz, T. (2006), The igraph software package for complex network research,298
InterJournal, Complex Systems, 1695, 5, 1–9299
Frontiers 9
Davila-Velderrain et al. Characterizing the Epigenetic Attractors Landscape
Davila-Velderrain, J., Martinez-Garcia, J., and Alvarez-Buylla, E. (2015a), Descriptive vs. mechanistic300
network models in plant development in the post-genomic era, Plant Functional Genomics: Methods301
and Protocols, 455–479302
Davila-Velderrain, J., Martinez-Garcia, J. C., and Alvarez-Buylla, E. R. (2015b), Modeling the epigenetic303
attractors landscape: toward a post-genomic mechanistic understanding of development, Frontiers in304
genetics, 6305
Davila-Velderrain, J., Villarreal, C., and Alvarez-Buylla, E. R. (2015c), Reshaping the epigenetic306
landscape during early flower development: induction of attractor transitions by relative differences307
in gene decay rates, BMC systems biology, 9, 1, 20308
Espinosa-Soto, C., Padilla-Longoria, P., and Alvarez-Buylla, E. R. (2004), A gene regulatory network309
model for cell-fate determination during arabidopsis thaliana flower development that is robust and310
recovers experimental gene expression profiles, The Plant Cell Online, 16, 11, 2923–2939311
Furrer, R., Nychka, D., and Sain, S. (2009), fields: Tools for spatial data, R package version, 6, 11312
Goulet, V., Dutang, C., Maechler, M., Firth, D., Shapira, M., and Stadelmann, M. (2013), expm: Matrix313
exponential, R package version 0.99-0314
Huang, S. (2012), The molecular and mathematical basis of waddington’s epigenetic landscape: A315
framework for post-darwinian biology?, Bioessays, 34, 2, 149–157316
Huang, S., Guo, Y.-P., May, G., and Enver, T. (2007), Bifurcation dynamics in lineage-commitment in317
bipotent progenitor cells, Developmental biology, 305, 2, 695–713318
Huang, S. and Kauffman, S. (2013), How to escape the cancer attractor: rationale and limitations of319
multi-target drugs, in Seminars in cancer biology, volume 23 (Elsevier), volume 23, 270–278320
Li, C. and Wang, J. (2013), Quantifying cell fate decisions for differentiation and reprogramming of a321
human stem cell network: landscape and biological paths, PLoS computational biology, 9, 8, e1003165322
Mendoza, L. and Alvarez-Buylla, E. R. (1998), Dynamics of the genetic regulatory network for323
arabidopsis thaliana flower morphogenesis, Journal of theoretical biology, 193, 2, 307–319324
Müssel, C., Hopfensitz, M., and Kestler, H. A. (2010), Boolnetan r package for generation, reconstruction325
and analysis of boolean networks, Bioinformatics, 26, 10, 1378–1380326
Villarreal, C., Padilla-Longoria, P., and Alvarez-Buylla, E. R. (2012), General theory of genotype327
to phenotype mapping: derivation of epigenetic landscapes from N-node complex gene regulatory328
networks., Physical review letters, 109, 11, 118102329
Waddington, C. H. (1957), The strategy of genes (London: George Allen & Unwin, Ltd.)330
Wang, G., Zhu, X., Gu, J., and Ao, P. (2014a), Quantitative implementation of the endogenous molecular–331
cellular network hypothesis in hepatocellular carcinoma, Interface focus, 4, 3, 20130064332
Wang, J. (2015), Landscape and flux theory of non-equilibrium dynamical systems with application to333
biology, Advances in Physics, 64, 1, 1–137334
Wang, P., Song, C., Zhang, H., Wu, Z., Tian, X.-J., and Xing, J. (2014b), Epigenetic state network335
approach for describing cell phenotypic transitions, Interface Focus, 4, 3, 20130068336
Wang, W. (2013), Therapeutic hints from analyzing the attractor landscape of the p53 regulatory circuit,337
Science signaling, 6, 261, pe5–pe5338
Wilkinson, D. J. (2011), Stochastic modelling for systems biology (CRC press)339
Zhou, J. X. and Huang, S. (2011), Understanding gene circuits at cell-fate branch points for rational cell340
reprogramming, Trends in Genetics, 27, 2, 55–62341
Zhou, J. X., Qiu, X., d’Herouel, A. F., and Huang, S. (2014a), Discrete gene network models for342
understanding multicellularity and cell reprogramming: From network structure to attractor landscapes343
landscape, In: Computational Systems Biology. Second Edition. Elsevier, 241–276344
Zhou, J. X., Samal, A., d’Hèrouël, A. F., Price, N. D., and Huang, S. (2014b), Relative stability of network345
states in boolean network models of gene regulation in development, arXiv preprint arXiv:1407.6117346
Zhu, X., Yuan, R., Hood, L., and Ao, P. (2015), Endogenous molecular-cellular hierarchical modeling of347
prostate carcinogenesis uncovers robust structure, Progress in biophysics and molecular biology348
This is a provisional file, not the final typeset article 10
a)
b)
c)
attractor global orderingattractor time-order
Aa Ab
+
da->b > 0
da->b > 0
Aa
Ab
Aa Ab
probabilistic landscape
Aa Ab
Aa Ab Aa Ab
Aa Ab
Aa
Ab
rab
rba rbb
raa
inter-attractor transition (IAT)
S1 S2
S3 S4 r44
...r11S1
S2
S3
S4
S1S2S3S4
...
...
... ...
r41
r14
inter-state transition (IAT)
xi=1 xi=0
Aa
Ab
S1
S2
S3
S4
x1(t+ 1) = F1(x2)
x2(t+ 1) = F2(x1)
state-space characterization
a) b)
c) d)
e) f)
a) b)
c) d)
e) f)
Rb E2F
p53
TELasa
NF-kB
p16
Snai2
Cyclin
ESE-2
a) Algorithm 1
b) Algorithm 2
Epi Sen Mes
Epi Sen Mes
a) Algorithm 1 a) Algorithm 2
 
 
7.5 7.0 6.5 6.0
PC
2 
0.
5 
0.
0 
-0
.5
 
  
       
PC1 
7.5 
7.0 
6.5 
6.0
A
rticle
Molecular Evolution Constraints in the Floral Organ
Specification Gene Regulatory Network Module across
18 Angiosperm Genomes
Jose Davila-Velderrain,1,2 Andres Servin-Marquez,3 and Elena R. Alvarez-Buylla*,1,2
1Instituto de Ecologı́a, Universidad Nacional Autónoma de México, México, D.F., México
2Centro de Ciencias de la Complejidad, C3, Universidad Nacional Autónoma de México, México, D.F., México
3Facultad de Ciencias Biológicas, Universidad Autónoma de Nuevo León, San Nicolás de los Garza, Nuevo León, México
*Corresponding author: E-mail: eabuylla@gmail.com.
Associate editor: Michael Purugganan
Abstract
The gene regulatory network of floral organ cell fate specification of Arabidopsis thaliana is a robust developmental
regulatory module. Although such finding was proposed to explain the overall conservation of floral organ types and
organization among angiosperms, it has not been confirmed that the network components are conserved at the
molecular level among flowering plants. Using the genomic data that have accumulated, we address the conservation
of the genes involved in this network and the forces that have shaped its evolution during the divergence of angiosperms.
We recovered the network gene homologs for 18 species of flowering plants spanning nine families. We found that all the
genes are highly conserved with no evidence of positive selection. We studied the sequence conservation features of the
genes in the context of their known biological function and the strength of the purifying selection acting upon them in
relation to their placement within the network. Our results suggest an association between protein length and sequence
conservation, evolutionary rates, and functional category. On the other hand, we found no significant correlation
between the strength of purifying selection and gene placement. Our results confirm that the studied robust develop-
mental regulatory module has been subjected to strong functional constraints. However, unlike previous studies, our
results do not support the notion that network topology plays a major role in constraining evolutionary rates. We
speculate that the dynamical functional role of genes within the network and not just its connectivity could play an
important role in constraining evolution.
Key words: gene regulatory network, flower development, molecular evolution, functional constraint.
Introduction
An outstanding goal in molecular evolution is to bridge the
gap between the study of individual molecules and the study
of systems on higher levels of biological organization.
In modern evolutionary studies, the limitations of considering
genes as individual entities upon which evolutionary forces
act independently are becoming generally accepted. The
emerging picture is that in which evolutionary forces, func-
tional constraints, and molecular interactions are condition-
ally dependent on the systems level (Cork and Purugganan
2004). Following this line of research, several studies have
analyzed molecular evolution at the pathway or network
level (see, e.g., Hahn et al. 2004; Alvarez-Ponce et al. 2009;
Jovelin and Phillips 2009; Yang et al. 2009; Montanucci et al.
2011; Alvarez-Ponce 2012). Most studies support the idea
that evolutionary forces acting on genes are in close relation
with the structure/topology of their functional network.
Previous network-based molecular evolutionary studies
have focus on investigating networks in relation to the evo-
lutionary rates of their genes based on large-scale molecular
networks (Fraser et al. 2002; Agrafioti et al. 2005; Hahn and
Kern 2005; Lemos et al. 2005; Alvarez-Ponce and Fares
2012). Recently, similar analysis have been applied to
well-characterized, relatively small pathways (Alvarez-Ponce
et al. 2009, 2011; Casals et al. 2011; Fitzpatrick and O’Halloran
2012; Lavagnino et al. 2012; Invergo et al. 2013). Both
approaches have uncovered interesting yet preliminary pat-
terns (seeMontanucci et al. 2011 and references therein). The
conclusion, so far, appears to be that evolutionary pressures
acting on genes are in close relation with the structure of their
functional network. But contrasting results have been found
in several cases, and when considering the latter, there is no
general consensus for the relationship between network
properties and the molecular evolution of its components:
different patterns have been found for different interacting
systems and different species sets. Thus, the need for resolu-
tion of contrasting results and the search of robust evolution-
ary patterns call for new studies. It has been suggested that
the analysis of new pathways might help to uncover general
patterns and to disentangle topological restrictions of net-
works from the biological properties and functions
(Montanucci et al. 2011). Here, we argue that the study of
the molecular evolution of the genes involved in regulatory
modules that have been uncovered with dynamical gene reg-
ulatory network (GRN) models could help uncover general
evolutionary principles, given that such models allow a
 The Author 2013. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please
e-mail: journals.permissions@oup.com
Mol. Biol. Evol. doi:10.1093/molbev/mst223 Advance Access publication November 21, 2013 1
 MBE Advance Access published December 4, 2013
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
rigorous distinction between structure and function. In con-
trast to schematic representations that depict gene regulatory
interactions, dynamic models may consider the nonlinear
aspects of regulation and explore the way gene expression
changes in time, both in wild-type and perturbed simulated
systems (Alvarez-Buylla et al. 2010). Nevertheless, to the best
of our knowledge, a network-based molecular evolutionary
study is lacking for the case of experimentally grounded and
functionally validated dynamic GRN models.
It is generally accepted that GRNs are underlyingmolecular
systems orchestrating developmental processes (Huang and
Kauffman 2009; Alvarez-Buylla et al. 2010). On the other
hand, it has been suggested that the specific nature of evo-
lutionary forces acting on the component genes depends
largely on the function of the interacting system (Cork and
Purugganan 2004). In this work, we follow a similar approach
to that of previous network-level evolutionary studies; but
instead of analyzing a new metabolic pathway, we focus on
the molecular evolution and network properties of a well-
studied GRN module: the experimentally grounded floral
organ cell fate specification determination GRN (FOS-GRN)
(see Espinosa-Soto et al. 2004; Alvarez-Buylla et al. 2010 for
updates).
The FOS-GRN (fig. 1) integrates molecular genetic data for
the ABC genes and their main interactors in A. thaliana. This
GRN includes key regulators underlying the transition from
the shoot apical meristem once it produces the apical inflo-
rescence meristem with the flower primordia in its flanks
(flowering locus t [FT], terminal flower1 [TFL], embryonic
flower1 [EMF1], LEAFY [LFY], APETALA1 [AP1], fruitfull
[FUL]), the ABCs and some of their interacting genes
(APETALA1 [AP1], APETALA3 [AP3], PISTILLATA [PI],
APETALA2 [AP2], AGAMOUS [AG], SEPALLATA [SEP]), as
well as some genes that link floral organ specification to
other modules regulating primordia formation and homeo-
stasis (AG andWUS) and to some regulators of organ bound-
aries (UFO). From the 15 genes, 6 are members of the MADS-
box protein family (AG, AP1, AP3, PI, SEP, FUL) and belong to
five different subfamilies (AG, SQUA, GLO, DEF, and AGL2)
within the clades of MADS-box genes (Becker and Theissen
2003).
Themodel was proposed on the basis of experimental data
for these 15 genes in the model plant A. thaliana. Among the
15 genes, 5 are grouped into three classes (A-type, B-type, and
C-type) whose combinations, described by the ABC model,
are necessary for floral organ cell specification (Coen and
Meyerowitz 1991). A-type genes (AP1 and AP2) are necessary
for sepal specification, A-type together with B-type (AP3 and
PI) for petal specification, B-type and C-type (AGAMOUS) for
stamen specification, and the C-type gene (AG) alone for
carpel primordia cell specification. Although the ABC
model of flower development was published more than 20
years ago, it was just recently that the model of the FOS-GRN
provided a sufficient explanation for the observed ABC pat-
terns and the stable gene expression configurations observed
during early flower development in Arabidopsis (Mendoza
and Alvarez-Buylla 1998; Espinosa-Soto et al. 2004; and up-
dates and review in Alvarez-Buylla et al. 2010). The network
has been studied from different perspectives (Alvarez-Buylla
et al. 2008; Sanchez-Corrales et al. 2010; Villarreal et al. 2012),
and the results of multiple studies have shown that its dy-
namical behavior is robust enough as to predict the observed
phenotypes both in wild-type and several mutant conditions.
In other words, there is enough evidence to sustain the claim
that the 15 genes involved in the network form a core regu-
latory module responsible for primordial cell fate determina-
tion during early stages of flower development. We reasoned
that such a functional constraint could play a strong role in
constraining evolutionary rates at the molecular level. Based
on this idea, here we addressed whether orthologous genes of
the FOS-GRN were found and conserved in distantly related
angiosperm species, and then we addressed the evolutionary
forces that could have shaped its evolution under the hypoth-
esis that positive Darwinian selection would not be a prevail-
ing force.
A large number of the genes involved in floral develop-
ment belong to the eukaryotic MADS-box gene family
(Riechmann et al. 1997). Most studies on the molecular
basis of floral development focus on these genes, particularly
floral homeotic genes such as AGAMOUS (AG), APETALA3
(AP3), PISTILLATA (PI), and several AGAMOUS-like genes
(Lawton-Rauh et al. 2000). Background information on
genetic and expression analyses indicate that members of a
floral homeotic gene group tend to share similar develop-
mental functions in flower and inflorescence morphogenesis
(Purugganan et al. 1995; Purugganan 1997), thus reflecting
high conservation among evolutionarily related regulatory
genes. Previous studies on the evolutionary forces acting on
some of the genes involved in flower development have fo-
cused on intraspecific population genetics data (Purugganan
and Suddith 1999) or data from two closely related species
(Yang et al. 2011). These studies have shown that although
most floral genes have evolved under strong purifying
AG
AP1
AP3
PI
TFL1
WUS
LFY
FUL
AP2
EMF1
FT
UFO
SEP3
CLFLUG
FIG. 1. Graph representation of the FOS-GRN. Arrows and blunt-
ended edges correspond to activating and repressing interactions,
respectively.
2
Davila-Velderrain et al. . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
selection, some show elevated nonsynonymous substitution
rates and/or positively selected sites. However, given that
these molecular evolutionary studies have focused mostly
on closely related species, it is not known whether the com-
plete set of genes conforming the FOS-GRN are globally con-
served among flowering plants. In order to first explore this
possibility, here we follow a comparative genomics approach,
and, unlike previous work, we study the molecular evolution
of the network over a broad taxonomic distance involving
monocots and dicots; the recent completion, annotation, and
analysis of the genomes of several flowering plant species has
provided the opportunity to do so.
In summary, the aim of this work was 3-fold: 1) to explore
the degree of conservation of the genes involved in the FOS-
GRN, 2) to uncover the prevailing molecular evolutionary
forces acting upon its genes, and 3) to study the evolutionary
constraints that its network properties and known biological
function impose to the molecular evolution of its compo-
nents. With this in mind, we first searched for the homologs
of the genes in the A. thaliana FOS-GRN in all the flowering
species with a sequenced and annotated genome available
(a total of 18; see fig. 2 for the species used and their place-
ment in angiosperm phylogeny). With the sequence data for
the FOS-GRN genes, we measured the action of selective
pressures on individual protein-coding genes through the
estimation of synonymous and nonsynonymous substitution
rates (dS and dN, respectively) when comparing among spe-
cies. The ratio dN/dSmeasures the strength and nature of the
evolutionary forces indicating positive selection, neutral evo-
lution, or purifying selection when it is higher, equal, or lower
than 1, respectively. Both an overall ratio for the entire coding
sequence of a gene and estimates considering variation of the
ratio among sites were calculated (Yang and Bielawski 2000).
We then calculated molecular conservation features other
than evolutionary rates for each gene and asked whether
these features in addition to the evolutionary parameters
(dN, dS, dN/dS) show a pattern of association with the
known biological functions of the genes. Finally, we addressed
whether the forces that have shaped the evolution of the
genes during the divergence of angiosperms were correlated
to the placement of each gene within the FOS-GRN.
Results
Identification of the FOS-GRN Genes in Flowering
Plant Genomes
The experimentally grounded FOS-GRN proposed by
Espinosa-Soto et al. was used as a reference (Espinosa-Soto
et al. 2004; and updated in Alvarez-Buylla et al. 2010). The
original network proposed for A. thaliana has 15 genes and
their regulatory (activating or inhibitory) interactions (supple-
mentary table S1, SupplementaryMaterial online). In order to
study the conservation of the genes in the network across
species, we conducted homology analysis using the Plaza
Comparative Genomics Platform (Proost et al. 2009) (see
Materials and Methods). For each gene in the network, a
total of 418 putative homologs (orthologs and in-paralogs)
of the 15 A. thaliana (Ath) FOS-GRN genes were identified
in the genomes of the other 17 flowering plant species:
Arabidopsis lyrata (Aly), Brachypodium distachyon (Bdi),
Carica papaya (Cpa), Fragaria vesca (Fve), Glycine max
(Gma), Lotus japonicus (Lja), Malus domestica (Mdo),
Manihot esculenta (Mes), Medicago truncatula (Mtr), Oryza
sativa japonica (Osj), Oryza sativa indica (Osi), Populus tricho-
carpa (Ptr), Ricinus communis (Rco), Sorghum bicolor (Sbi),
Theobroma cacao (Tca), Vitis vinifera (Vvi), and Zea mays
(Zma) (see fig. 2). These results correspond to the preliminary
network conservation data and were organized in the form of
a conservation matrix (also called phylogenetic profile) where
each row represents a gene vector composed by a set of
characters {0, 1, 2, 3, 4} representing the absence (0), presence
(1), or the total number of in-paralogs (2, 3, 4) of each gene;
and each column represents a species (supplementary table
S2, Supplementary Material online). All FOS-GRN genes stud-
ied, with the exception of EMF1, have orthologs in all 18
genomes. The gene EMF1 was not found as an ortholog of
the EMF1 gene inA. thaliana (AT5G11530) among themono-
cot plants: B. distachyon, O. sativa japonica, O. sativa indica, S.
bicolor, and Z. mays. However, following the same method-
ology, but using instead the corresponding protein sequence
of the gene EMF1 reported for O. sativa (OS01G12890) as
query, putative orthologs were found in all four cases. For
the only case of this gene (EMF1), it was discovered that
there exists one orthologous group for dicots and a different
group for monocots. The relationship between both groups is
not clear and will be studied in subsequent studies.
Manual Curation of Putative In-paralogs
The preliminary conservation data of the proteins in the
FOS-GRN of A. thaliana were manually curated to produce
the final conservation data of the proteins in the FOS-GRN
reported here in the form of a conservationmatrix (fig. 2) and
the corresponding list of gene IDs (supplementary table S4,
Supplementary Material online). Certain proteins were elim-
inated from the list due to evidence of partial gene copies or
annotation errors (see Materials and Methods). We found
that all the FOS-GRN genes have homologs in all 18 genomes
searched. Results also show that all the genes underwent
a number of duplication and/or loss events. The detailed
evolutionary processes (e.g., duplication, loss, and pseudogen-
ization) leading to the expansion of the network across
angiosperms will be explored in a future study.
Molecular Evolutionary Analysis of the FOS-GRN
The nonsynonymous (dN) to synonymous (dS) substitution
rate ratio (dN/dS) was calculated in order to infer the impact
of natural selection on the FOS-GRN. The values of the overall
ratio dN/dS range from 0.05936 for PI to 0.39577 for EMF1,
suggesting that purifying selection or selection constraint best
explains the evolution of the genes in the FOS-GRN (table 1).
Given that the estimation of an overall dN/dS for the whole
coding sequence is a very conservative measure of positive
selection (Yang and Bielawski 2000), estimates that account
for variation in dN/dS among sites in order to detect specific
sites that could have been fixed by positive selectionwere also
3
Molecular Evolution Constraints in FOS-GRN . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
calculated. Results showed that the genes UFO, FT, and CLF
yielded a marginal significant P value when comparing the
model M8 assuming positive selection with the null model
M7 of the program CODEML (see Materials and Methods).
However, the test was no longer significant after correcting
for multiple comparisons. For all 15 genes, the models M2a
was not significantly better than the null model M1a
(supplementary table S5, Supplementary Material online).
The overall dN/dS, dN, and dS were computed for each
gene under the M0 model (table 1). The genes of the FOS-
GRN are subject to strong purifying selection with an overall
mean dN/dS of 0.124. Overall dN/dS values are plotted in
figure 3; from the 15 genes, 13 (86.66%) have a dN/dS
value <0.15.
magnoliids
a
n
g
io
s
p
e
rm
s
e
u
d
ic
o
ts
c
o
re
e
u
d
ic
o
ts
m
o
n
o
c
o
t
commelinids
malvids
campanulids
lamiids
Amborellales
Nymphaeales
Austrobaileyales
Piperales
Canellales
Magnoliales
Laurales
Chloranthales
Commelinales
Zingiberales
Poales+
Arecales
Dasypogonaceae
Asparagales
Liliales
Pandanales
Dioscoreales
Petrosaviales
Alismatales
Acorales
Ceratophyllales
Ranunculales
Sabiaceae
Proteales
Buxales
Trochodendrales
Gunnerales
Cucurbitales
Fagales
Rosales+
Fabales+
Celastrales
Oxalidales
Malpighiales+
Zygophyllales
Malvales+
Brassicales+
Huerteales
Sapindales
Picramniales
Crossosomatales
Myrtales
Geraniales
Vitales+
Saxifragales
Dilleniaceae
Berberidopsidales
Santalales
Caryophyllales
Cornales
Ericales
Garryales
Gentianales
Lamiales
Solanales
Boraginaceae
Aquifoliales
Escalloniales
Asterales
Dipsacales
Paracryphiales
Apiales
Bruniales
fabids
Species APG III
Rosales
Rosales
Carica papaya (Cpa) Brassicales
Arabidops is thaliana (Ath) Brassicales
Arabidops is lyrata (Aly) Brassicales
Manihot es culenta (Mes) Malpighiales
Glycine max (Gma) Fabales
Lotus japonicus (Lja) Fabales
Medicago truncatula (Mtr) Fabales
Populus trichocarpa (Ptr) Malpighiales
Ricinus communis (Rco) Malpighiales
Theobroma cacao (Tca) Malvales
Brachypodium distachyon (Bdi) Poales
Oryza sativa japonica (Osj) Poales
Oryza sativa indica (Osj) Poales
Sorghum bicolor (Sbi) Poales
Zeamays (Zma) Poales
Fragaria vesca (Fve)
Malus domestica (Mdo)
Vitis vinifera (Vvi) Vitales
Ge n e Me s
AG 1 1 2 1 2 2 2 1 1 2 1 1 1 2 1 1
AP1 1 1 2 1 2 1 2 2 2 2 1 2 1 2 1 1
AP2 1 1 1 1 4 1 4 2 1 1 1 2 2 1 2 2
AP3 1 1 1 1 2 1 1 2 1 1 1 1 1 1 1 1
CLF 1 1 1 1 2 1 2 1 1 1 1 2 1 1 1 1
EMF1 1 1 1* 1 1 1 3 1 2 1* 1* 2 1 1* 1 1
FT 1 1 2 1 2 1 2 1 1 3 2 3 4 2 1 1
FUL 1 1 1 1 3 1 2 2 1 1 1 1 1 1 1 1
LFY 1 1 1 3 2 1 2 2 1 1 1 1 1 1 1 1
LUG 1 1 3 2 4 1 1 1 2 1 2 2 1 2 2 1
PI 1 1 2 1 4 1 1 2 1 2 1 2 1 2 1 1
SEP 1 1 1 1 2 1 2 1 2 1 1 2 1 1 1 1
TFL1 1 1 1 1 2 1 2 3 1 2 2 2 1 2 2 1
UFO 1 1 1 2 2 1 2 1 1 1 1 2 1 1 1 1
WUS 1 1 1 3 2 2 5 3 2 1 1 4 1 1 2 3
Aly At h Bd i Fv e Gm a Lja Md o Mt r Os j Os i P t r Rc o S b i Tc a Vv i(a)
(b)
(c)
FIG. 2. Gene conservation data, species used, and their placement in Angiosperm phylogeny. (a) Conservation matrix of the genes involved in the
FOS-GRN across Angiosperm species (*Genes were identified using Oryza sativa EMF1 protein (OS01G12890) for homology search; +Families
considered in the analysis). (b) Angiosperms phylogeny APG III according to Bremer et al. (2009). (c) Species used in the analysis.
4
Davila-Velderrain et al. . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
Analysis of the Classes of Genes
To test whether the measures of dN/dS, dN, or dS were sta-
tistically different between two gene classes, the ABC genes
and the additional genes in the network, a Kruskal–Wallis test
was performed. Although the genes EMF1 andWUS showed
higher dN/dS values than the 86.66% of the genes, the test
gave no significant differences in dN/dS, dN, or dS between
the classes. Means and P values are shown in supplementary
tables S6 and S7, Supplementary Material online.
Model-Based Clustering of Sequence
Conservation Features
During initial exploratory data analysis, it was observed that
protein and DNA coding sequences of some of the genes in
the FOS-GRN across the angiosperm species show interesting
patterns in measures of conservation other than the
evolutionary parameters (supplementary figs. S1–S7,
Supplementary Material online). The following conservation
features were calculated (see Materials and Methods): the
degree of variability in protein size of each protein over all
species (measured by the coefficient of variation), mean pro-
tein pairwise sequence distances, mean protein sequence dis-
tance, and mean DNA sequence distance (table 2). Given
such data, the following question raised: is there an associa-
tion between such conservation patterns and the functional
classification of the proteins in the network?
In order to explore this possibility, a model-based cluster-
ing analysis was applied. Clustering is the process of grouping
similar objects together. Here, a feature-based clustering ap-
proach was used, in which an N D feature matrix is used as
input (Murphy 2012). A feature matrix was assembled where
each of the N rows represents a particular gene and the
D columns corresponded to the conservation features listed
above, together with an additional column corresponding to
the dN/dS data (table 2). In other words, each row represents
a conservation feature vector for each gene. This analysis does
not make any assumption about the prior known functional
category of the genes. Instead, it divides the genes into clusters
according to the similarity among their feature vectors. The
analysis was restricted to include all but the EMF1 and WUS
genes: when all the genes were included, an additional cluster
was invariably obtained for each of the two genes (EMF1 and
WUS) given their high dN/dS and interspecies sequence dis-
tances (data not shown). Interestingly, the methodology un-
covered three clusters (fig. 4): one corresponding the genes
AG,AP1,AP2, and PI (circles); one for the genes FUL, LFY,UFO,
and AP3 (triangles); and the last one to the additional genes
CLF, FT, LUG, SEP, and TFL1 (squares). The four genes in the
first cluster correspond to ABC floral organ identity genes.
While the genes in the second cluster, except AP3, floral mer-
istem identity genes (Krizek and Fletcher 2005). These results
suggest an association between molecular size and sequence
conservation features, evolutionary rates, and functional
category. Those genes with a well-characterized function
Table 1. Evolutionary Parameters of the FOS-GRN Genes.
Gene Locus Protein Length Percent of Analized Codons dN dS dN/dS
AP1 AT1G69120 256 89 0.7683 6.1525 0.12487
AP2 AT4G36920 432 80 0.6095 5.6578 0.10773
AP3 AT3G54340 232 93 0.7713 8.0723 0.09555
CLF AT2G23380 902 67 0.6369 5.386 0.11824
EMF1 AT5G11530 1096 66 3.8105 9.6281 0.39577
FT AT1G65480 175 99 0.4509 5.9891 0.07529
FUL AT5G60910 242 95 0.8261 7.0456 0.11725
LFY AT5G61850 420 83 0.715 8.5201 0.08392
LUG AT4G32551 931 78 0.5995 5.2033 0.11522
PI AT5G20240 208 58 0.602 10.1413 0.05936
SEP AT1G24260 250 90 0.6172 7.9816 0.07733
TFL1 AT5G03840 177 97 0.5 5.973 0.0837
UFO AT1G30950 442 84 0.9109 8.4945 0.10723
WUS AT2G17950 292 51 2.615 12.799 0.20431
FIG. 3. Calculated dN/dS values sorted in increasing order. The horizon-
tal dotted line is plotted to show that, from the 15 genes, 13 (86.66%)
have a dN/dS value<0.15. Plotted values were calculated using the M0
model.
5
Molecular Evolution Constraints in FOS-GRN . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
(e.g., a direct involvement in the processes of floral or meri-
stem identity) share more similar conservation features
among them than with the additional interacting genes
which are known to be involved in various processes. Genes
in the last cluster are known to integrate the flowering
process with upstream signaling mechanisms and either pro-
mote (e.g., FT) or inhibit (TFL1) flower organ development.
Figure 4b shows a two-dimensional projection of the feature
vector alongwith the corresponding classification boundaries;
it is interesting to note that the boundaries between the
meristem (triangles) and flower identity (circles) clusters
merge and are clearly separated from the third cluster
(squares). This is consistent with the known biological
mechanisms where genes such as AP1 participate as both
meristem and floral organ identity genes.
Given that clustering is an unsupervised learning tech-
nique, it is hard to evaluate the quality of the output on
any given method. One way to do so is to rely on some
external form of data with which to validate the method. In
the case in point, labels representing functional categories can
be assigned to each gene. Each gene was labeled with one of
the three categories: floral organ identity, floral meristem
identity, and other. The clustering was then compared with
the labels using a standard metric: the Rand index (see
Materials and Methods). This metric was calculated for the
output of the clustering. Then, its statistical significance was
(a) (b)
FIG. 4. Output from the model-based clustering analysis. (a) Scatterplot matrix for conservation features with points (genes) marked according to the
corresponding cluster; the ellipses shown are the multivariate analogs of the standard deviations for each mixture component. (b) Data projection on a
dimension reduced subspace. Clustering structure and boundaries are shown; genes are marked according to the corresponding cluster.
Table 2. Gene Conservation Features.
Gene No. Protein Sequences Protein Size (CV) Protein Mean Distance DNA Codon Mean Distance dN/dS
AG 30 0.241130802 0.437267 0.4316755 0.09981
AP1 30 0.210350092 0.4575694 0.4514316 0.12487
AP2 30 0.095096262 0.4422398 0.4184613 0.10773
AP3 22 0.099160315 0.4942555 0.4455772 0.09555
CLF 22 0.354842355 0.4084511 0.3709341 0.11824
EMF1 28 0.323552388 1.742475 1.2504943 0.39577
FT 31 0.254844679 0.2390374 0.3548233 0.07529
FUL 22 0.18187526 0.464033 0.4358368 0.11725
LFY 25 0.182826509 0.4513709 0.4390543 0.08392
LUG 37 0.236524789 0.3276286 0.3408934 0.11522
PI 29 0.184321532 0.4537865 0.3875463 0.05936
SEP 25 0.19888967 0.3333186 0.3837757 0.07733
TFL1 27 0.009807079 0.2475236 0.324812 0.0837
UFO 23 0.037201769 0.6089123 0.5777019 0.10723
WUS 36 0.17482874 1.1723443 0.8080616 0.20431
6
Davila-Velderrain et al. . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
assessed through their frequency sampling distribution com-
puted using a bootstrap resampling method (Murphy 2012).
The observed clustering decisions are highly significant
(P value= 0.0002). Thus, there is statistical support for an
association between the molecular conservation features,
the evolutionary rates, and the functional category of the
genes in the FOS-GRN.
The Strength of the Purifying Selection and
Network Structure
Each node in the network was characterized by a set of
features including the molecular evolutionary parameters
(dN, dS, and dN/dS) and its placement within the network
topology, using measures such as centrality, degree, closeness,
betweenness, and eccentricity (see Materials and Methods).
GRNs contain directed interactions with either an inhibitory
or an activating character. Given that the dynamical behavior
of GRNs is associated with the type of interactions within
the network, the topological network properties, out-
degree, in-degree, activating in-degree, and inhibitory in-de-
gree, were also included as features (supplementary table S8,
Supplementary Material online). Once the evolutionary
parameters and the network topological features were calcu-
lated, the goal was to answer the following questions: 1) Is
there a relationship between the evolutionary parameters and
the network nodes topological location within the FOS-GRN?
2) How strong is the relationship found, if any? 3) Which
network topological features contribute the most to evolu-
tionary rates?
A relationship between each of the evolutionary parame-
ters and each of the node’s topological features within the
FOS-GRN was tested. Assuming an approximately linear re-
lationship, model coefficients were estimated independently
for each of the networks’ topological features as single pre-
dictor variables of the evolutionary parameters. Hypothesis
tests on the coefficients were performed in order to test
whether or not there is a relationship between the variables
in each case. Mathematically, this corresponds to testing
whether the corresponding coefficient is equal to 0 or not.
Details of the least squares models for the regression of dN/dS
on each of the topological features used are provided in
supplementary table S10, Supplementary Material online.
Interestingly, the null hypothesis that the coefficient is
equal to 0 could not be rejected for any case; consequently,
a relationship between the dN/dS and any of the networks
topological features tested could not be declared to exist,
given the available data. The same analysis was applied indi-
vidually to dN and dS as response variables. Only a marginal
significant relationship (P value ~0.05) was found between dS
and closeness. In a preliminary analysis, Spearman’s rank cor-
relation coefficients between the evolutionary parameters
and the topological network properties were also calcu-
lated and are reported in supplementary table S9,
Supplementary Material online. No significant correlation
was found between the measures of centrality and the
evolutionary estimates.
Similarity in Evolutionary Parameters of
Interacting Genes
It has been suggested that interacting elements within a net-
work share more similar values of evolutionary parameters
within themselves than with noninteracting components
(Alvarez-Ponce et al. 2009). In order to test whether this pat-
tern is present in the FOS-GRN, two different approaches
were applied: 1) the average absolute difference (AAD) of
the value of the evolutionary parameters between interacting
components in the networks was used as an statistic and
compared with its null distribution in an ensemble of similar
but random networks (Alvarez-Ponce et al. 2009), and 2) a
matrix of pairwise shortest path distances between the genes
in the network was compared with the matrices of pairwise
absolute differences in evolutionary parameters (Montanucci
et al. 2011). Using the former approach, an AAD of dN/dS of
0.0567 was calculated for the FOS-GRN. The histogram of the
corresponding statistic on an ensemble of 100,000 random
networks with the same number of nodes and interactions is
shown in supplementary figure S8, Supplementary Material
online. The simulated data follow closely a Gaussian distribu-
tion. The obtained data were used to estimate the probability
of observing such a small value. Two approaches were fol-
lowed: 1) calculating the fraction of random networks show-
ing an AAD value 0.0567 and 2) calculating the probability
of such a value using a Gaussian density function with an
empirically estimated mean and standard deviation (supple-
mentary fig. S8, SupplementaryMaterial online). The resulting
probabilities were 0.12768 and 0.12852, respectively.
For the second approach, a Mantel test comparing a
matrix of pairwise distances between genes in the network
and matrices of pairwise absolute differences in evolutionary
parameters was applied for dN/dS, dN, and dS. The test found
no significant correlation between distance and difference
in any evolutionary parameter (supplementary table S11,
Supplementary Material online). The results of both
approaches do not support the hypothesis that neighboring
genes share similar evolutionary constraints in the case of the
FOS-GRN.
Discussion
The question of whether the role of regulators involved in the
control of floral initiation is conserved across flowering plants
has been raised recently in the literature (Wellmer and
Riechmann 2010). Of particular interest is the situation of
grass-like plants and other monocots, which are distantly
related to A. thaliana and its relatives. Based on the identifi-
cation of homologs of the main regulators involved in the
control of floral initiation ofA. thaliana inmonocots as well as
observations of expression patterns in different species, it has
been suggested that many aspects of the topology of the
floral transition network seem to be conserved between
dicots and monocots (Wellmer and Riechmann 2010).
However, empirical gene conservation data based on
whole-genome analysis were lacking. Given the availability
of multiple genomes of angiosperms—both monocots and
dicots—a comparative genomics approach was possible and
7
Molecular Evolution Constraints in FOS-GRN . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
enabled us to uncover a clearer picture of the conservation
status of the regulators known to be involved in the control of
floral initiation and floral organ specification in A. thaliana
across angiosperms. We focused specifically on the regulators
participating in the FOS-GRN model (see Espinosa-Soto et al.
2004; Alvarez-Buylla et al. 2010 for updates). Our results show
that all the FOS-GRN genes have representatives in the 18
angiosperm species used in this study. The existence of all the
genes in all the surveyed species, together with the high
selective constraint level found in this study (mean
dN/dS= 0.124), suggests that the FOS-GRN is functionally
constrained across all these species belonging to nine families,
nine orders, and both monocot and dicot species. This is
consistent with what we might expect given the robustness
of the FOS-GRN as a developmental regulatory module and
the observed expression patterns of some of the genes of this
GRN documented for different species (Espinosa-Soto et al.
2004; Alvarez-Buylla et al 2010). These results, however, do
not provide information of whether or not there are consid-
erable differences in network circuitry among species. The
empirical data obtained here may serve, nonetheless, as a
basis to explore the dynamical behavior of the corresponding
FOS-GRN in different species under the assumption of con-
served interactions among network components. Indeed, fur-
ther model refinements as well as phenotypic validations and
testable predictions could be generated following such a
theoretical approach.
Our results also show that the genes in the FOS-GRN have
undergone a number of duplication and/or loss events.
The evolutionary history of MADS-box genes involved in
flowering has been extensively studied with phylogenetic
approaches (see, e.g., Alvarez-Buylla et al. 2000). A complex
history of gene duplications within the AP1/FUL clade during
angiosperm evolution is well documented (Preston and
Kellogg 2007). The results of gene conservation obtained in
this work suggest a similar complex history for most of
the genes of the other gene families in the FOS-GRN.
Furthermore, it is well known that some of the species in-
cluded in the study have shared whole-genome duplication
(WGD) events. For example, A. thaliana has experienced at
least three WGD events—two recent events since its diver-
gence from other members of the Brassicales clade and a
more ancient event shared with most, if not all, eudicots
(Bowers et al. 2003). A WGD event occurred more than
once before the split between A. thaliana and A. arenosa
(Ha et al. 2009); consequently, the two Arabidopsis species
included in the analysis have shared WGD events which are
not shared with the other species. This evolutionary scenario
may partially account for the complex pattern of duplications
observed in the conservation data; unfortunately, it also
makes it difficult to establish clear relationships of orthology.
The empirical conservation data reported herein thus serve as
a basis for further phylogenetic studies which are needed in
order to better explain the processes leading to the conser-
vation and expansion of the FOS-GRN across angiosperms.
The data concerning the overall conservation of the FOS-
GRN genes obtained here suggest interesting questions for
future investigation in diverse angiosperm species, such as
addressing whether the interactions of the flower organ iden-
tity genes and their interacting partners are conserved among
monocots and dicots or not. What is the role of the dupli-
cated genes in the dynamics of the FOS-GRN? Does such gene
redundancy increase the robustness of the process at the level
of the GRN dynamics? These and similar questions can be
explored starting from the conservation data reported here
and following a combination of theoretical and experimental
approaches. A first approach to the role of duplications in the
FOS-GRN can be found in Espinosa-Soto et al. (2004) for the
case of the B-function genes in Petunia. This study showed
that the FOS-GRN is dynamically robust to duplications.
In a study based on a comparative genomics approach, the
quality of genome annotation is of major concern. The fact
that putative annotation errors were detected recurrently in
the same species gives support to the curational process
followed, but it also suggests the need of more careful anno-
tations in the genomes of L. japonicus, O. sativa indica,
P. trichocarpa, M. esculenta, and R. communis. Future im-
provements in annotation quality may help the curational
process in gene network conservation studies. Here, we report
the conservation data for the FOS-GRN both before (supple-
mentary table S5, Supplementary Material online) and after
manual curation (fig. 2).
Selective Constraints in the FOS-GRN
It has been suggested that additional plant species, other than
the experimental model species, should be included in mo-
lecular evolutionary studies to completely appreciate the con-
servation and evolvability of the regulatory network for flower
development (Yang et al. 2011). Here, we show that thewhole
GRN controlling cell specification during early stages of flower
development, when primordial floral organ cells are specified,
has evolved under purifying selection. Unlike previous studies,
we considered a wider range of angiosperms including both
monocot and dicot species. Our results agree with previous
conclusions: floral organ identity genes evolved under strong
purifying selection. The evolution of the genes considered in
the FOS-GRN is functionally constrained, as evidenced by the
dN/dS ratios. We calculated an overall mean dN/dS of 0.124.
From the 15 genes, 13 (86.66%) have a dN/dS value <0.15.
Yang et al. recently reported the molecular evolutionary anal-
ysis of a group of 58 genes involved in flower development
that includes all the genes that were analyzed in the present
work, with the exception of EMF1 (Yang et al. 2011). Their
analysis included only the species A. thaliana and A. lyrata. In
their study, the authors report an average dN/dS value of 0.17
and interpret this result as evidence suggesting that these
genes have overall evolved under purifying selection. On
the other hand, the smaller average dN/dS value that we
calculated for the 15 genes of the FOS-GRN (0.124) is based
on amuchwider range of species; and some of them aremore
distantly related than the two compared in the study of Yang
et al. Furthermore, the calculated average value is highly influ-
enced by the high dN/dS value corresponding to EMF1
(0.39577). If we omit EMF1 in the calculation, the average
dN/dS is 0.1049864. This observation supports the conclusion
8
Davila-Velderrain et al. . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
that the calculated dN/dS values are small and suggest that
the FOS-GRN is functionally constrained. In order to find
further support for our interpretation, we analyzed the dN
and dS values previously reported for the whole-genome set
of orthologous between A. thaliana and A. lyrata and calcu-
lated the average dN/dS value over the complete data set (see
Materials andMethods). The calculated average dN/dS is 0.29;
the complete empirical distribution is shown in supplemen-
tary figure S9a, Supplementary Material online. Using this
whole-genome data set, we conducted a resampling experi-
ment in order to calculate the likelihood of observing an
average dN/dS value over a group of 15 genes equal or smaller
to the one we report (0.124). The fraction of values from this
distribution with a value equal or less than 0.124 was 0.00038.
Hence, the encountered small value could be found in a
random sample of the same size with a very small probability
(P value= 0.00038). Supplementary figure S9b,
SupplementaryMaterial online, shows the distribution of sim-
ulated average dN/dS values. Considering that our dN/dS
calculations are based on a set including more distant species,
this empirical evidence strongly supports our claim that the
reported average dN/dS of 0.124 is small.
When testing for evidence of positive selection as a force
which could have fixed specific sites, using models that
account for site class variability in dN/dS, we found that
sites with a dN/dS >1 may exist only in UFO, FT, and CLF
as evidenced by a marginal significant P value (before con-
trolling for multiple tests) when comparingmodelM8 assum-
ing positive selection with the null model M7 (see Materials
and Methods). On the other hand, no single site in these
proteins showed a high posterior probability when the
Bayes’ theorem was applied in order to identify potential
targets of diversifying selection. Thus, in this study,
both global and site varying models failed to detect any
signature of positive selection for any codon of the FOS-
GRN genes.
Unlike the above results, previous studies have found
evidence of adaptive evolution acting at particular sites in
some of the genes included in the FOS-GRN. Olsen et al.
found evidence that suggests an adaptive mechanism
behind the patterns of variation found on TFL1 and LFY.
These and similar studies (see, e.g., Olsen et al. 2002; Moore
et al. 2005) are, in contrast to the present study, based on
population genetic tests and data. Hence, these studies have
captured the patterns of variation in these genes resulting
from recent divergent evolution. Future studies should fur-
ther investigate the microevolutionary process at play among
the FOS-GRN genes. Some evidence at hand suggests that
even for more recent divergences, floral organ identity genes
will show evidence of strong purifying selection (Yang et al.
2011), but other flower transition genes seem to have been
prone to positive selection as well (Martı́nez-Castilla and
Alvarez-Buylla 2003); however, both selective forces are not
mutually exclusive in any given gene.
Martı́nez-Castilla and Alvarez-Buylla (2003) focused on the
Arabidopsis MADS-box gene family and found several sites
within the MADS and K boxes, with high probabilities of
having been fixed under positive selection, suggesting that
these boxes may have played important roles in the acquisi-
tion of novel functions during recent events of MADS-box
diversification. Here, through the analysis of alignments con-
structed on the basis of 1–1 orthologous relationships for
distantly related angiosperm species, we did not find evidence
of positive selection on such sites. Our result suggest that
although adaptive evolution probably plays an important
role during recent diversification events of the MADS-box
gene family, a constrained evolution have prevailed upon
the functionally established orthologous members across
species which diverged more years ago. The question of
whether or not the MADS-box gene family shows similar
signs of adaptive evolution in species other than A. thaliana
is open. This question, and its relevance for the phenotypic
evolution of plants, is interesting given the complexity of the
duplication events that have shaped the MADS-box gene
family in angiosperms, as evidenced by the presence of mul-
tiple copies of flowering MADS-box genes found in several
angiosperm species.
Selective Constraints and Functional Categories
Previous studies on floral genes in different populations of
A. thaliana or different Arabidopsis species have also shown
that floral organ identity genes evolved under strong purifying
selection, but some flowering-time genes experienced rela-
tively relaxed purifying selection and positive selection
(Olsen et al. 2002; Moore et al. 2005). It has been suggested
that selective constraints acting on genes of the same family
are closely associated with their functions (Yang et al. 2011).
The FOS-GRN includes genes which have been shown to be
functionally associated with the promotion of flower meri-
stem identity (LFY, AP1, UFO) or with floral organ identity
(the ABC genes AP1, AP2, AP3, PI, AG). For historical and
empirical reasons, the ABC genes have been qualified as
having a prominent role in the process of cell fate and
organ type specification during early flower development.
Given this background information, the presence of a stron-
ger functional constraint upon such genes in relation with the
other interacting genes would be a reasonable hypothesis.
Our results show that there is no significant difference
between the molecular evolutionary parameters of these
genes and the other genes in the FOS-GRN (supplementary
table S7, Supplementary Material online), however. This sug-
gests that the ABC genes have not been subject to a stronger
functional constraint than the rest of the FOS-GRN genes, at
least as evidenced by the differential rate of evolution analyses
that we performed in this study. Instead, it seems that it is
the whole regulatory module which is under a strong evolu-
tionary constraint.
In contrast to the previous result, when molecular size and
sequence conservation features were considered in addition
to the dN/dS, it was possible to cluster the proteins into
groups consistent with their functional roles. Specifically, an
unsupervised model-based clustering analysis grouped the
FOS-GRN proteins into three clusters consistent with their
associated functions during inflorescence and flower devel-
opment; and this consistency was assessed statistically
9
Molecular Evolution Constraints in FOS-GRN . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
(see Results). Our results show that meristem and flower
identity genes share similar molecular conservation features
among them, whereas these are quite different from those
observed in genes known to be involved in several other
mechanisms with no apparent single prominent function.
We interpret these results as evidence suggesting a constraint
associated with the functional role of the genes. Although it is
complicated to define rigorously a specific function for the
individual components of complexmolecular systems such as
GRNs, given that no gene acts independently of their inter-
acting partners or in a context-specific manner, our multivar-
iate clustering approach uncovered a nontrivial pattern.
Without any prior assumption about differences among the
proteins, the methodology separated the genes in groups in a
way consistent with the empirically known functions.
Furthermore, the classification boundaries separating the
clusters only merge in the case of the two groups in which
some of their components are known to be associated with
both functions (e.g., AP1 is both a meristem and floral organ
identity protein). Interestingly, it is only possible to uncover
such a pattern when conservation measures other than evo-
lutionary rates or sequence similarity were considered. The
degree of conservation in sequence length seems to be rele-
vant and closely associated with the molecular function.
Finally, it is worth mentioning that the uncovered pattern
is only obtained when considering several conservation fea-
tures and not just a single evolutionary parameter or similarity
measure.
Molecular Evolutionary Parameters and
Network Architecture
Previous studies have suggested several approaches to test
whether there is a relationship between network architecture
and the molecular evolutionary parameters of the network’s
components (dN, dS, dN/dS): 1) the calculation of correlation
coefficients between network topological measures of cen-
trality and molecular evolutionary parameters (Montanucci
et al. 2011), 2) the calculation of whether interacting nodes
within a network havemore similar values of the evolutionary
parameters than noninteracting nodes (Alvarez-Ponce et al.
2009), and 3) the comparison of a matrix of pairwise shortest
path distances between genes in the network and matrices of
pairwise absolute differences in evolutionary parameters
(Montanucci et al. 2011). Here, the three approaches were
applied to the FOS-GRN, in addition to a regression-based
modeling approach. Most of the above approaches assume
that the architecture or topology of the network affects the
molecular evolution of its nodes, and they implicitly assume
then that such static network structure somehow is corre-
lated to dynamical or functional modularity. Unlike previous
network-level molecular evolutionary studies, we did not find
a significant relationship between network architecture and
the evolutionary parameters: 1) no significant correlation was
found between the evolutionary parameters and the mea-
sures of centrality of the nodes, 2) analyses did not support
the hypothesis that neighboring genes in the network share
similar evolutionary constraints, and 3) regression coefficients
did not support a relationship between the molecular evolu-
tionary parameters and any of the nodes’ topological features
tested. This result suggests that the proteins of the FOS-GRN,
although subject to purifying evolutionary forces, do not
show any discernible pattern of association between the
strength of constraint and the local structural properties
within the network. This implies that the whole module is
subject to similar molecular evolutionary constraints and/or
the structural considerations do not have a functional or
dynamical relevance that might have been important for
the evolutionary constraints experienced by different nodes
within the FOS-GRN. These results should be interpretedwith
caution, however, because of the small sample size. Statistical
analysis has two goals that directly conflict. First is to find
patterns in data. The second goal is a fight against apophenia,
the human tendency to invent patterns in random data
(Klemens 2008). In the context of GRNs, care should be
taken when testing for the existence of relationships (or
lack thereof) between node features and evolutionary pat-
terns based on statistical analysis. The identification of “real
patterns” could be limited by the size of the data set analyzed.
Nonetheless, it is noteworthy that previous studies for small
pathways/networks with a similar number of nodes as in the
GRN analyzed here (20 nodes) have found significant
trends between topological and evolutionary parameters
(see, e.g., Alvarez-Ponce et al. 2009; Fitzpatrick and
O’Halloran 2012).
Given that we did find an association between conserva-
tion features of the genes—including evolutionary rates—and
their functional role during flower development, and consid-
ering that the role of specific genes in the specification of
meristem and floral identity has been probed during the
analysis of the FOS-GRN as a dynamical system (Espinosa-
Soto et al. 2004), we speculate that functional (dynamical),
instead of topological, network properties, such as those
associated with robustness, could be significantly associated
with the molecular evolutionary constraints of the genes in
the FOS-GRN reported here.
Overall, our results depict a general picture of the evolu-
tionary pattern of the FOS-GRN where functional constraint
better explains the evolution of its genes. The approach fol-
lowed here provided new data relevant for the study of the
evolution of the mechanisms at the molecular level that are
behind organ identity during early flower development.
Specifically, we have shown that 1) the FOS-GRN genes are
conserved among 18 Angiosperm species; 2) a complex his-
tory of gene duplications seems to have been involved in the
expansion of the network across angiosperms; 3) the whole
FOS-GRN has evolved under purifying selection; 4) ABC floral
organ identity genes do not show a significantly stronger
evolutionary constraint than the other genes in the FOS-
GRN; 5) an association between protein length and sequence
conservation features, evolutionary rates, and functional cat-
egory seems to prevail among the genes in the FOS-GRN;
and 6) the FOS-GRN does not show any significant relation-
ship between network architecture and the evolutionary
parameters of its genes.
10
Davila-Velderrain et al. . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
Materials and Methods
Sequence Data
The FOS-GRN described in Espinosa-Soto et al. (2004) and
updated in Alvarez-Buylla et al. (2010) was used as study
system; the corresponding genes are reported in supplemen-
tary table S1, Supplementary Material online. The identifiers
of the genes involved in this network were obtained from
the TAIR database (http://www.arabidopsis.org, last accessed
November 24, 2013) and integrated into the workbench tool
of the Plaza Comparative Genomics Platform (http://bioin-
formatics.psb.ugent.be/plaza/, last accessed November 24,
2013) (Proost et al. 2009).
After applying the PLAZA integrative method of ortholo-
gous genes finding (discussed later), both the sequence data
of the genes of A. thaliana and the sequence data of the
corresponding homologous genes were retrieved using the
export functionality of the PLAZA’S workbench tool. This
first data set corresponds to the FOS-GRN preliminary gene
conservation set which includes those species with a se-
quenced and annotated genome and is represented as a con-
servationmatrix and a list of corresponding gene identifiers in
supplementary tables S2 and S3, Supplementary Material
online, respectively. In order to reduce the probability of
reporting the conservation of nonfunctional proteins, the
preliminary data set was manually curated. For this purpose,
erroneous automatic orthology designations were discarded,
and those groups of adjacent gene annotations actually cor-
responding to different regions of a single gene were merged
(discussed later). The final and corrected conservation data of
the FOS-GRN proteins across angiosperms are reported in the
form of a conservation matrix (fig. 1a) and its corresponding
list of gene IDs (supplementary table S4, Supplementary
Material online).
Homology Search
The PLAZA Comparative Genomics Platform offers an access
point for plant comparative genomics centralizing genomic
data produced by different genome sequencing initiatives
(Proost et al. 2009). The PLAZA integrative method of
orthologous genes integrates a complementary set of data
types and methodologies in order to infer orthologous
gene relationships based on the following sources of
evidence: Orthologous gene families (ORTHO) inferred
using OrthoMCL, Tree-based orthologs (TROG) inferred
using tree reconciliation of the phylogenetic tree of a gene
family, Best-Hits-and-Inparalogs (BHI) inferred from Blast hits
against the PLAZA protein database, and Anchor points refer
to gene-based colinearity between species. Using this tool,
different homology relationship types can be considered:
when a gene has no paralogs and only 1 ortholog (1–1),
when a gene has 1 or more paralogs and only 1 ortholog
(N–1), and the corresponding combinations for a total of
four different orthology relationship: 1–1, N–1, 1–N, and
M–N. In this work, the PLAZA integrative method was
used to infer homology gene relationships for each protein
in the FOS-GRN. The following settings were used: all
orthologous relationship types were allowed, all evidence
types were taken into account, and 18 plant species corre-
sponding to the Phylum Angiospermae were included (see
Results).
Manual Curation of Putative In-paralogs
As the degree and quality of annotation of whole-genome
projects varies considerably among species, it is not adequate
to rely only on automatic procedures, and instead, careful
data set cleaning is necessary. Further manual curation to
the reported gene groups after a homology analysis should
be considered in order to reduce the likelihood of including
nonfunctional proteins in other analyses. For each gene in the
preliminary conservation data list (supplementary table S3,
Supplementary Material online), the following information
was extracted from PLAZA Comparative Genomics
Platform: CDS sequence, protein sequence, chromosome,
location (e.g., start, stop), length, and InterPro annotated pro-
tein domains. Given these data, some putative in-paralog
genes were manually eliminated from the preliminary conser-
vation data. On the other hand, the homology status of some
genes was updated based on one or more of the following
criteria: partial proteins (small size), lack of any of the protein
domains of the orthologous gene in A. thaliana, neighboring
genomic location, or low sequence alignment quality. The
preliminary status of certain genes in the conservation data
as multiple single paralogous copies in the same genome was
modified to single copy orthologous genes, once it was real-
ized that in many cases different boxes of the same open
reading frame were sometimes annotated as different
genes. Details of the manual curation process and sequence
selection criteria are described in the supplementary text,
Supplementary Material online.
Multiple Alignments and Phylogenetic Inference
All protein multiple sequence alignments (MSAs) were gen-
erated using the software CLUSTALW version 2.1 (Larkin et al.
2007). The software PAL2NAL (Suyama et al. 2006) was used
to generate multiple codon alignments from the correspond-
ing aligned protein sequences and the corresponding DNA
coding sequences. For each orthologous group, a maximum
likelihood phylogeny estimation was conducted using the
software Phylm (Guindon and Gascuel 2003; Guindon et al.
2010) applying the nucleotide substitution model that best
fits the data according to the Akaike information criterion.
Details of the selected substitution models are provided on
supplementary table S12, Supplementary Material online.
Both phylogeny estimation and substitution model selection
were conducted using the function phymltest of the package
ape (Paradis et al. 2004) in the R statistical programming
environment (www.R-project.org, last accessed November
24, 2013) as described in Paradis (2012).
Analysis of the Evolutionary Rates
The evolutionary parameters dN, dS, and dN/dS were esti-
mated following a maximum likelihood procedure as imple-
mented in the software codeml of the PAML package version
11
Molecular Evolution Constraints in FOS-GRN . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
4.5 (Yang 2007). Due to the broad range of species considered
for the conservation study, it was not possible to obtain re-
liable alignments for all 18 species for molecular evolutionary
analysis. This analysis was then restricted to a representative
group of species (A. lyrata, A. thaliana, B. distachyon, G. max,
M. esculenta, O. sativa, S. bicolor, T. cacao, and Z. mays) to
avoid bias in the dN/dS values—this decision was based on
the manual inspection of the resulting alignments. All align-
ments are publicly available upon request. Only MSAs based
on (putative) 1:1 ortholog sets were used. In the cases in
which there weremore than one gene copy in a given species,
the gene with the most complete sequence or the one with-
out any homogenization features (stop codons or frameshift
mutations) was used. For each codon alignment, two tests of
positive selection were performed. In order to test whether
the assumption of positive selection fits better the data than
the assumption of nearly neutral evolution, the model M2a
was compared against the null model M1a through a likeli-
hood ratio test (LRT). In a second test of positive selection, the
models M7 (null model of neutral evolution), which assumes
that dN/dS follows a (discrete) beta distribution among sites
and M8 (positive selection model), which adds a class of dN/
dS which can be greater than 1, were compared through an
LRT. The false discovery rate and Bonferroni corrections for
the multiple tests of positive selection were conducted using
the function p.adjust of the stats package in the R statistical
programming environment. In all the analyses, the F34
codon frequency model was used. Details of the LRT for
each comparison are provided in supplementary table S5,
Supplementary Material online. The strength of purifying se-
lection was measured using the dN/dS values computed
through the M0 model, which calculates rates encompassing
all the branches of the tree and for the entire length of the
sequence.
The dN and dS values reported for the whole-genome
orthologous pairs between A. thaliana and A. lyrata were
downloaded from the Ensemble Plant website (http://
plants.ensembl.org/index.html, last accessed November 24,
2013) using the BioMart platform for data retrieval. The cor-
responding dN/dS values and their statistics were calculated
over the complete data set, omitting missing data (a total of
22,531 values). The empirical distribution is shown in supple-
mentary figure S9a, Supplementary Material online. A resam-
pling experiment was conducted using the complete set of
dN/dS values as follows: a large number of gene groups of size
15 (100,000) were randomly generated, the dN/dS average
value was calculated for each group, and the distribution
obtained values was used to estimate the likelihood of ob-
serving an average value equal or smaller than the one calcu-
lated for the FOS-GRN (0.124). The simulated distribution is
shown in supplementary figure S9b, Supplementary Material
online.
Gene Conservation Features
The pairwise distances from protein MSAs were calculated
using the function dist.ml of the package phangorn (Schliep
2011) with the default parameters. In the case of DNA codon
MSAs, a matrix of pairwise distances was computed using the
dist.dna function of the package ape (Paradis et al. 2004) with
the default parameters. To obtain a final scalar conservation
feature, the correspondingmeans were calculated and used as
a summary statistics. The coefficient of variation in protein
size of each protein over all species was calculated as a mea-
sure of the degree of conservation (variation) in molecular
size. All the calculations discussed in this section were con-
ducted using the R statistical programming environment.
Genes Clustering and Function
Hypothesis tests of statistical difference of the evolutionary
parameters between the ABC floral organ identity genes and
the other genes in the FOS-GRN were conducted following a
nonparametric method (Kruskal-Wallis test). A model-based
clustering analysis was conducted using the molecular and
sequence conservation features in table 2 (last four columns)
as an input feature matrix. Intuitively, the goal of clustering is
to assign points that are similar to the same cluster and to
ensure that points that are dissimilar are in different clusters.
The analysis was conducted as implemented in the function
Mclust of the mclust package version 4.1 (Fraley et al. 2012).
This procedure fits a Gaussian finite mixture model to the
data through an EM algorithm. The best model is selected
according to the Bayesian information criterion. The cluster-
ing procedure was evaluated using the functional categories
of the genes as an external form of data for validation. The
clustering was then compared with the labels using as sum-
mary statistic the Rand index, which measures the fraction of
clustering decisions that are correct (Murphy 2012). The Rand
index was calculated using the function cluster_similarity of
the package clusteval (http://cran.r-project.org/web/pack-
ages/clusteval/, last accessed November 24, 2013). In order
to assess the statistical significance of the clustering, the fre-
quentist sampling distribution of a standard summary statis-
tic that quantifies the fraction of clustering decision that are
correct was computed using a bootstrap method. The Rand
index was used as a summary statistic (Murphy 2012).
Specifically, a character vector corresponding to the clustering
output was permuted a large number of times (n=1,000,000)
and compared each time with the labels vector using the
Rand index. The obtained sampling distribution was used
to calculate the probability of observing a Rand index value
equal or greater than the one observed when comparing the
original output of the clustering analysis with the labels
vector. Both model-based clustering analysis and clustering
evaluation were conducted in the R statistical programming
environment.
Evolutionary Rates and Network Architecture
The measures of centrality describe numerically the topolog-
ical importance of a node in a graph, given its structure.
For each gene (node) in the FOS-GRN, the followingmeasures
of centrality were calculated: degree (number of nodes it is
connected to), closeness (reciprocal of the average distance to
all other nodes), betweenness (fraction of all shortest paths
that pass through it), and eccentricity (maximum distance
12
Davila-Velderrain et al. . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
from it to all other nodes). All network topological compu-
tations were conducted using the igraph package (Csardi and
Nepusz 2006). Two analyses were conducted in order to test
for the association of the evolutionary parameters of the
genes and their topological features within the network. 1)
Spearman correlation coefficients were calculated between
each evolutionary parameter given by the model M0 (dN,
dS, and dN/dS) and each topological features. 2) Simple
linear regression models were fitted using each evolutionary
parameter as response variable and each topological feature
as predictor.
It was also investigated whether genes that are interacting
in the FOS-GRN have related values of the evolutionary pa-
rameters. For this purpose, two additional analyses were con-
ducted. In the first analysis, following Alvarez-Ponce et al.
(2009), the average absolute difference of the value of the
evolutionary parameters between interacting components
in the network was calculated and then used as an statistic
in a simulation (sampling) procedure in order to assess how
frequently it is expected to observe this or a smaller value in
an ensemble of similar but random networks. Specifically,
100,000 networks each with the same number of nodes
and interactions were generated, and the statistic was calcu-
lated for each of these networks. The estimated distribution
of the statistic over the ensemble of networks was then used
to calculate the probability of observing a value equal or
smaller than that calculated in the FOS-GRN. A Gaussian
density function with parameters estimated from the data
(mean= 0.0713 and standard deviation= 0.0128) was also fit-
ted from the observed simulated data and used for probabil-
ity calculations. In the second analysis, following Montanucci
et al. (2011), a matrix of pairwise shortest path distances be-
tween the nodes (path distance matrix) and threematrices of
absolute pairwise gene differences in each of the evolutionary
parameters were computed. Each of these last matrices was
then compared with the path distance matrix through stan-
dardized Mantel tests using the ecodist package. All the anal-
yses discussed in this section were conducted using the R
statistical programming environment.
Supplementary Material
Supplementary figures S1–S9 and tables S1–S12 are available
at Molecular Biology and Evolution online (http://www.mbe.
oxfordjournals.org/).
Acknowledgments
E.R.A.B. acknowledges the support of the Miller Institute for
Basic Research in Science, University of California, Berkeley,
while spending a sabbatical leave in the lab of Chelsea Specht.
The authors acknowledge technical support of Rigoberto V.
Pérez-Ruiz and logistical and administrative help of Diana
Romo. This article constitutes a partial fulfillment of the grad-
uate program Doctorado en Ciencias Biomédicas of the
Universidad Nacional Autónoma de México, UNAM in
which J.D.-V. developed this project. This work was supported
by grants from CONACYT, Mexico: 180098 (to E.R.A.B.) from
PAPIIT-UNAM, IN203113-3 (to E.R.A.B.).
References
Agrafioti I, Swire J, Abbott J, Huntley D, Butcher S, Stumpf MP. 2005.
Comparative analysis of the Saccharomyces cerevisiae and
Caenorhabditis elegans protein interaction networks. BMC Evol
Biol. 5:23.
Alvarez-Buylla ER, Azpeitia E, Barrio R, Benitez M, Padilla-Longoria P.
2010. From ABC genes to regulatory networks, epigenetic land-
scapes and flower morphogenesis: making biological sense of theo-
retical approaches. Semin Cell Dev Biol. 21:108–117.
Alvarez-Buylla ER, Chaos A, Aldana M, et al. (11 co-authors). 2008. Floral
morphogenesis: stochastic explorations of a gene network epige-
netic landscape. PLoS One 3:e3626.
Alvarez-Buylla ER, Pelaz S, Liljegren SJ, Gold SE, Burgeff C, Ditta GS, De
Pouplana LR, Mart́ınez-Castilla L, Yanofsky MF. 2000. An ancestral
MADS-box gene duplication occurred before the divergence of
plants and animals. Proc Natl Acad Sci U S A. 97:5328–5333.
Alvarez-Ponce D. 2012. The relationship between the hierarchical posi-
tion of proteins in the human signal transduction network and their
rate of evolution. BMC Evol Biol. 12:192.
Alvarez-Ponce D, Aguadé M, Rozas J. 2009. Network-level molecular
evolutionary analysis of the insulin/TOR signal transduction path-
way across 12 Drosophila genomes. Genome Res. 19:234–242.
Alvarez-Ponce D, Aguadé M, Rozas J. 2011. Comparative genomics
of the vertebrate insulin/TOR signal transduction pathway: a
network-level analysis of selective pressures. Genome Biol Evol. 3:
87–101.
Alvarez-Ponce D, Fares MA. 2012. Evolutionary rate and duplicability in
the Arabidopsis thaliana protein-protein interaction network.
Genome Biol Evol. 4:1263–1274.
Becker A, Theissen G. 2003. The major clades of MADS-box genes and
their role in the development and evolution of flowering plants.Mol
Phylogenet Evol. 29:464–489.
Bowers JE, Chapman BA, Rong J, Paterson AH. 2003. Unravelling angio-
sperm genome evolution by phylogenetic analysis of chromosomal
duplication events. Nature 422:433–438.
Bremer B, Bremer K, Chase M, Fay M, Reveal J, Soltis D, Soltis P, Stevens
P. 2009. An update of the Angiosperm Phylogeny Group classifica-
tion for the orders and families of flowering plants: APG III. Bot J Linn
Soc. 161:105–121.
Casals F, Sikora M, Laayouni H, Montanucci L, Muntasell A, Lazarus R,
Calafell F, Awadalla P, Netea MG, Bertranpetit J. 2011. Genetic ad-
aptation of the antibacterial human innate immunity network. BMC
Evol Biol. 11:202.
Coen ES, Meyerowitz EM. 1991. The war of the whorls: genetic interac-
tions controlling flower development. Nature 353:31–37.
Cork JM, Purugganan MD. 2004. The evolution of molecular genetic
pathways and networks. Bioessays 26:479–484.
Csardi G, Nepusz T. 2006. The igraph software package for complex
network research. InterJournal, Complex Systems 1695:5.
Espinosa-Soto C, Padilla-Longoria P, Alvarez-Buylla ER. 2004. A gene
regulatory network model for cell-fate determination during
Arabidopsis thaliana flower development that is robust and
recovers experimental gene expression profiles. Plant Cell 16:
2923–2939.
Fraley C, Raftery AE, Murphy TB, Scrucca L. 2012. MCLUST version 4 for
R: normal mixture modeling for model-based clustering, classifica-
tion, and density estimation. Technical Report no. 597, Department
of Statistics, University of Washington.
Fraser HB, Hirsh AE, Steinmetz LM, Scharfe C, Feldman MW. 2002.
Evolutionary rate in the protein interaction network. Science
296(5568):750–752.
Fitzpatrick DA, O’Halloran DM. 2012. Investigating the relationship be-
tween topology and evolution in a dynamic nematode odor genetic
network. Int J Evol Biol. 2012(2012):548081.
Guindon S, Dufayard J, Lefort V, Anisimova M, Hordijk W, Gascuel O.
2010. New algorithms and methods to estimate maximum-likeli-
hood phylogenies: assessing the performance of PhyML 3.0. Syst Biol.
59:307–321.
13
Molecular Evolution Constraints in FOS-GRN . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
Guindon S, Gascuel O. 2003. A simple, fast, and accurate algorithm to
estimate large phylogenies by maximum likelihood. Syst Biol. 52:
696–704.
Ha M, Kim ED, Chen ZJ. 2009. Duplicate genes increase expression
diversity in closely related species and allopolyploids. Proc Natl
Acad Sci U S A. 106:2295–2300.
Hahn MW, Conant GC, Wagner A. 2004. Molecular evolution in large
genetic networks: does connectivity equal constraint? J Mol Evol. 58:
203–211.
Hahn MW, Kern AD. 2005. Comparative genomics of centrality and
essentiality in three eukaryotic protein-interaction networks. Mol
Biol Evol. 22(4):803–806.
Huang S, Kauffman S. 2009. Complex gene regulatory networks-from
structure to biological observables: cell fate determination. In:
Meyers RA, editor. Encyclopedia of complexity and systems science.
Berlin: Springer. p. 1180–1293.
Invergo BM, Montanucci L, Laayouni H, Bertranpetit J. 2013. A system-
level, molecular evolutionary analysis of mammalian phototransduc-
tion. BMC Evol Biol. 13:52.
Jovelin R, Phillips PC. 2009. Evolutionary rates and centrality in the yeast
gene regulatory network. Genome Biol. 10:R35.
Klemens B. 2008. Modeling with data: tools and techniques for scientific
computing. Princeton (NJ): Princeton University Press.
Krizek BA, Fletcher JC. 2005. Molecular mechanisms of flower develop-
ment: an armchair guide. Nat Rev Genet. 6:688–698.
LarkinMA, Blackshields G, BrownNP, et al. (13 co-authors). 2007. Clustal
W and Clustal X version 2.0. Bioinformatics 23:2947–2948.
Lavagnino N, Serra F, Arbiza L, Dopazo H, Hasson E. 2012. Evolutionary
genomics of genes involved in olfactory behavior in the Drosophila
melanogaster species group. Evol Bioinform Online. 8:89–104.
Lawton-Rauh AL, Alvarez-Buylla ER, Purugganan MD. 2000. Molecular
evolution of flower development. Trends Ecol Evol. 15:144–149.
Lemos B, Bettencourt BR, Meiklejohn CD, Hartl DL. 2005. Evolution of
proteins and gene expression levels are coupled in Drosophila and
are independently associated with mRNA abundance, protein
length, and number of protein-protein interactions. Mol Biol Evol.
22(5):1345–1354.
Martı́nez-Castilla L, Alvarez-Buylla ER. 2003. Adaptative evolution in the
Arabidopsis MADS-box gene family inferred from its complete re-
solved phylogeny. Proc Natl Acad Sci U S A. 100(23):13407–13412.
Mendoza L, Alvarez-Buylla ER. 1998. Dynamics of the genetic regulatory
network for Arabidopsis thaliana flower morphogenesis. J Theor Biol.
193(2):307–319.
Montanucci L, Laayouni H, Dall’Olio GM, Bertranpetit J. 2011. Molecular
evolution and network-level analysis of the N-glycosylation meta-
bolic pathway across primates. Mol Biol Evol. 28:813–823.
Moore RC, Grant SR, Purugganan MD. 2005. Molecular population ge-
netics of redundant floral-regulatory genes in Arabidopsis thaliana.
Mol Biol Evol. 22:91–103.
Murphy K. 2012. Machine learning: a probabilistic approach. Cambridge
(MA): MIT Press.
Olsen KM, Womack A, Garrett AR, Suddith JI, Purugganan MD. 2002.
Contrasting evolutionary forces in the Arabidopsis thaliana floral
developmental pathway. Genetics 160:1641–1650.
Paradis E. 2012. Analysis of phylogenetics and evolution with R.
New York: Springer.
Paradis E, Claude J, Strimmer K. 2004. APE: analyses of phylogenetics and
evolution in R language. Bioinformatics 20:289–290.
Preston JC, Kellogg EA. 2007. Conservation and divergence of
APETALA1/FRUITFULL-like gene function in grasses: evidence
from gene expression analyses. Plant J. 52:69–81.
Proost S, Van Bel M, Sterck L, Billiau K, Van Parys T, Van de Peer Y,
Vandepoele K. 2009. PLAZA: a comparative genomics resource to
study gene and genome evolution in plants. Plant Cell 21:3718–3731.
Purugganan MD. 1997. The MADS-box floral homeotic gene lineages
predate the origin of seed plants: phylogenetic and molecular clock
estimates. J Mol Evol. 45:392–396.
Purugganan MD, Rounsley SD, Schmidt RJ, Yanofsky MF. 1995.
Molecular evolution of flower development: diversification of the
plant MADS-box regulatory gene family. Genetics 140:345–356.
Purugganan MD, Suddith JI. 1999. Molecular population genetics of
floral homeotic loci: departures from the equilibrium-neutral
model at the APETALA3 and PISTILLATA genes of Arabidopsis
thaliana. Genetics 151:839–848.
Riechmann JL, Meyerowitz EM. 1997. MADS domain proteins in plant
development. J Biol Chem. 378:1079.
Sanchez-Corrales YE, Alvarez-Buylla ER, Mendoza L. 2010. The
Arabidopsis thaliana flower organ specification gene regulatory net-
work determines a robust differentiation process. J Theor Biol. 264:
971–983.
Schliep KP. 2011. phangorn: phylogenetic analysis in R. Bioinformatics 27:
592–593.
Suyama M, Torrents D, Bork P. 2006. PAL2NAL: robust conversion of
protein sequence alignments into the corresponding codon align-
ments. Nucleic Acids Res. 34:W609–W612.
Villarreal C, Padilla-Longoria P, Alvarez-Buylla ER. 2012. General theory of
genotype to phenotype mapping: derivation of epigenetic land-
scapes from N-node complex gene regulatory networks. Phys Rev
Lett. 109:118102.
Wellmer F, Riechmann JL. 2010. Gene networks controlling the initiation
of flower development. Trends Genet. 26:519–527.
Yang Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood.
Mol Biol Evol. 24:1586–1591.
Yang L, Chun-Ce G, Gui-Xia H, Hong-Yan S, Hong-Zhi K. 2011.
Evolutionary pattern of the regulatory network for flower develop-
ment: insights gained from a comparison of two Arabidopsis species.
J Syst Evol. 49:528–538.
Yang Y, Zhang F, Ge S. 2009. Evolutionary rate patterns of the
Gibberellin pathway genes. BMC Evol Biol. 9:206.
Yang Z, Bielawski JP. 2000. Statistical methods for detecting molecular
adaptation. Trends Ecol Evol. 15:496–503.
14
Davila-Velderrain et al. . doi:10.1093/molbev/mst223 MBE
 at U
N
IV
E
R
S
IT
A
T
 P
O
M
P
E
U
 F
A
B
R
A
 o
n
 Jan
u
ary
 1
2
, 2
0
1
4
h
ttp
://m
b
e.o
x
fo
rd
jo
u
rn
als.o
rg
/
D
o
w
n
lo
ad
ed
 fro
m
 
XAANTAL2 (AGL14) Is an ImportantComponentof
the Complex Gene Regulatory Network that
Underlies Arabidopsis Shoot Apical Meristem
Transitions
Rigoberto V. Pérez-Ruiz1,4, Berenice Garcı́a-Ponce1,4,*, Nayelli Marsch-Martı́nez1,5,
Yamel Ugartechea-Chirino1, Mitzi Villajuana-Bonequi1,6, Stefan de Folter2, Eugenio Azpeitia1,7,
José Dávila-Velderrain1, David Cruz-Sánchez1, Adriana Garay-Arroyo1,
Marı́a de la Paz Sánchez1, Juan M. Estévez-Palmas1 and Elena R. Álvarez-Buylla1,3,*
1Instituto de Ecologı́a, Universidad Nacional Autónoma de México, 3er Circuito Exterior s/no, Junto al Jardı́n Botánico, and Centro de Ciencias de la Complejidad
Ciudad Universitaria, Coyoacán 04510, México D.F., Mexico
2Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional
(CINVESTAV-IPN), Km. 9.6 Carretera Irapuato - León, AP 629, 36821 Irapuato, Guanajuato, Mexico
3University of California, 431 Koshland Hall, Berkeley, CA 94720, USA
4These authors contributed equally to this article.
5Present address: Departamento de Biotecnologı́a y Biquı́mica, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional
(CINVESTAV-IPN), Km. 9.6 Libramiento Norte, Carr. Irapuato-León, 36821 Irapuato, Guanajuato, Mexico
6Present address: Max Planck Institute for Plant Breeding Research, D-50829 Cologne, Germany
7Present address: INRIA Project-Team Virtual Plants/CIRAD/INRA, UMR AGAPCampus St Priest - BAT 5, CC 05018, 860 rue de St Priest, 34095 Montpellier Cedex
5, France
*Correspondence: Berenice Garcı́a-Ponce (bgarcia@ecologia.unam.mx), Elena R. Álvarez-Buylla (eabuylla@gmail.com)
http://dx.doi.org/10.1016/j.molp.2015.01.017
ABSTRACT
InArabidopsis thaliana, multiple genes involved in shoot apical meristem (SAM) transitions have been char-
acterized, but the mechanisms required for the dynamic attainment of vegetative, inflorescence, and floral
meristem (VM, IM, FM) cell fates during SAM transitions are not well understood. Here we show that a
MADS-box gene, XAANTAL2 (XAL2/AGL14), is necessary and sufficient to induce flowering, and its regula-
tion is important in FM maintenance and determinacy. xal2 mutants are late flowering, particularly under
short-day (SD) condition, while XAL2 overexpressing plants are early flowering, but their flowers have vege-
tative traits. Interestingly, inflorescences of the latter plants have higher expression levels of LFY, AP1, and
TFL1 thanwild-type plants. In additionwe found that XAL2 is able to bind theTFL1 regulatory regions.On the
other hand, the basipetal carpels of the 35S::XAL2 lines lose determinacy and maintain high levels of WUS
expression under SD condition. To provide amechanistic explanation for the complex roles of XAL2 in SAM
transitions and the apparently paradoxical phenotypes of XAL2 and other MADS-box (SOC1, AGL24) over-
expressors,we conducted dynamic gene regulatory network (GRN) and epigenetic landscapemodeling.We
uncovered aGRNmodule that underlies VM, IM, and FMgene configurations and transition patterns inwild-
type plants as well as loss and gain of function lines characterized here and previously. Our approach thus
provides a novel mechanistic framework for understanding the complex basis of SAM development.
Key words: XAL2/AGL14, MADS-box, TFL1, SAM transitions, floral reversion, gene regulatory networks,
epigenetic landscape modeling
Pérez-Ruiz R.V., Garcı́a-Ponce B., Marsch-Martı́nez N., Ugartechea-Chirino Y., Villajuana-Bonequi M.,
de Folter S., Azpeitia E., Dávila-Velderrain J., Cruz-Sánchez D., Garay-Arroyo A., Sánchez M.P., Estévez-
Palmas J.M., and Álvarez-Buylla E.R. (2015). XAANTAL2 (AGL14) Is an Important Component of the Complex
Gene Regulatory Network that Underlies Arabidopsis Shoot Apical Meristem Transitions. Mol. Plant. 8, 796–813.
Published by the Molecular Plant Shanghai Editorial Office in association with
Cell Press, an imprint of Elsevier Inc., on behalf of CSPB and IPPE, SIBS, CAS.
796 Molecular Plant 8, 796–813, May 2015 ª The Author 2015.
Molecular Plant
Research Article
INTRODUCTION
Unraveling the molecular genetic mechanisms that underlie cell
transitions and plasticity is a fundamental issue in developmental
biology. Different cell states (e.g., proliferative, differentiated,
transdifferentiated, or reprogrammed) are correlated to different
combinations of gene activation (Sugimoto et al., 2011). Such
gene configurations, and the transitions among them, emerge
from complex regulatory networks (Álvarez-Buylla et al., 2010a,
2010b). Plants enable in vivo analyses of the molecular genetic
mechanisms underlying such cell plasticity and dynamics of
stem cells that remain active during their complete life cycle
within meristems.
At the shoot apical meristem (SAM) the transition from a vegeta-
tive to a reproductive state is crucial, with direct fitness implica-
tions (Roux et al., 2006). Molecular genetic approaches have
uncovered a complex gene regulatory network (GRN) underlying
Arabidopsis SAM development (Srikanth and Schmid, 2011;
Andrés and Coupland, 2012). Genetic screenings for mutant
plants with altered bolting time under contrasting environmental
conditions (Koornneef et al., 1991) have uncovered the
components of flowering transition pathways in response to:
photoperiod (Putterill et al., 1995; Suárez-López et al., 2001;
An et al., 2004), gibberellins (gibberellic acid [GA]; Blázquez
et al., 1998; Blázquez and Weigel, 2000; Porri et al., 2012),
non-optimal growth temperature over 4C (Blázquez et al.,
2003; Halliday et al., 2003; Balasubramanian et al., 2006;
Lee et al., 2007), vernalization (Michaels and Amasino, 1999;
Sheldon et al., 2000; Michaels et al., 2003), or internal
developmental cues (Koornneef et al., 1991; Simpson, 2004;
Wu and Poethig, 2006).
Many of the genes that participate in floral transition are MADS-
box genes (Gramzow et al., 2010). Some of them, such as
SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1
(SOC1), respond to more than one condition, and these have
been called integrators (Blázquez and Weigel, 2000; Lee et al.,
2000; Moon et al., 2003; Wang et al., 2009; Lee and Lee,
2010). Detailed functional characterization revealed that
flowering transition pathways converge in the regulation of
LEAFY (LFY) and APETALA1 (AP1), via SOC1–AGAMOUS-
LIKE 24 (AGL24) heterodimer, SQUAMOSA PROMOTER
BINDING PROTEIN-LIKE (SPL3) or FLOWERING LOCUS T–
FLOWERING LOCUS D (FT–FD) complex, at the founding cells
of the floral meristem (FM), thus establishing a new identity
distinct from the inflorescence meristem (IM). The FM later
sub-differentiates into the floral organs (Schultz and Haughn,
1991; Weigel et al., 1992; Abe et al., 2005; Yamaguchi et al.,
2009).
Gene expression configurations that characterize the IM and FM
identities, in addition to the floral organ primordia, have started to
be recovered and explained with dynamic GRN mechanistic
models, as attractors or steady states (Espinosa-Soto et al.,
2004; Álvarez-Buylla et al., 2010a; van Mourik et al., 2010;
Kaufmann et al., 2011; Jaeger et al., 2013). Such mechanistic
explanations are still lacking for normal and altered cell-fate tran-
sitions at the SAM in wild-type plants, and for certain MADS-box
overexpression lines (Yu et al., 2004; Ferrario et al., 2004; Liu
et al., 2007; Fornara et al., 2008).
The coexistence and, at the same time, the clear distinction of IM
and FM suggest a common underlying dynamic multi-stable
mechanism. Some genes have been identified as critical markers
of each of these SAM cellular identities, while others are shared
among them. Distinction between IM and FM depends on the
mutual repression of floral meristem identity genes, such as
LFY, AP1, and CAULIFLOWER (CAL), and IM genes, particularly
TERMINAL FLOWER1 (TFL1), an important regulator of inflores-
cence development (Shannon and Meeks-Wagner, 1991;
Alvarez et al., 1992; Weigel et al., 1992; Bowman et al., 1993;
Shannon and Meeks-Wagner, 1993; Gustafson-Brown et al.,
1994; Chen et al., 1997; Ohshima et al., 1997; Ratcliffe et al.,
1998, 1999; Ferrándiz et al., 2000; Parcy et al., 2002). TFL1
encodes a phosphatidylethanolamine-binding protein (PEBP)
that is transcribed in the center of the IM, but the protein moves
to other cells where AP1 and LFY are down-regulated (Bradley
et al., 1997; Conti and Bradley, 2007). tfl1 is an early flowering
mutant with a determinate inflorescence due to the ectopic
expression of LFY and AP1 in the IM (Shannon and Meeks-
Wagner, 1991; Schultz and Haughn, 1993; Gustafson-Brown
et al., 1994; Mandel and Yanofsky, 1995; Liljegren et al., 1999).
Conversely, single and double mutants of LFY and AP1 acquire
inflorescence-like structures because of the ectopic expression
of TFL1 (Huala and Sussex, 1992; Bowman et al., 1993; Bradley
et al., 1997; Ratcliffe et al., 1998, 1999; Benlloch et al., 2007).
Recent data show that the tight spatial and temporal regulation of
the components of the GRN underlying the transition to flowering
is also involved in FM identity and maintenance (Liu et al., 2009;
Posé et al., 2012). In this sense, genes such as SOC1, AGL24,
and SHORT VEGETATIVE PHASE (SVP), known to participate
in the regulation of flowering transition by regulating LFY in the
case of the first two genes (Lee et al., 2008; Liu et al., 2008),
and SVP in collaboration with FLOWERING LOCUS C (FLC) by
repressing SOC1 and FT (Hartmann et al., 2000; Lee et al.,
2007; Li et al., 2008), are also important during the first two
stages of flower development (Gregis et al., 2009; Liu et al.,
2009). At these stages, SOC1, AGL24, and SVP help to prevent
the premature expression of the B and C genes (Gregis et al.,
2006, 2009; Liu et al., 2009). Moreover SOC1, AGL24, SVP, and
SEP4 with AP1 repress the expression of TFL1 in the FM (Liu
et al., 2013). At stage 3 of FM development, AGL24 and SVP
are repressed by LFY and AP1, leading to further differentiation
and determinacy (Yu et al., 2004; Liu et al., 2007). Meanwhile,
expression of SOC1 and FRUITFULL (FUL, another MADS-box
gene) in the IM is important to repress secondary vascular growth
(Melzer et al., 2008). Therefore, SOC1, AGL24, SVP, and FUL are
important in both flowering transition, and floral and inflorescence
meristems identity and maintenance.
Additional evidence for the common underlying multi-stable
and non-linear GRN for SAM states and transitions is the
fact that several of the aforementioned MADS-domain proteins
are involved in multiple SAM states and transitions (Smaczniak
et al., 2012), sometimes with apparently paradoxical functions.
The overexpression of some MADS-box genes, such as
AGL24 or SOC1 and their homologs, induce early flowering
by up-regulating LFY and AP1 (Lee et al., 2000; Yu et al.,
2002; Michaels et al., 2003; Lee et al., 2008), but at the same
time produce flowers with vegetative characteristics that
resemble the ap1 mutant with elongated carpels, especially
Molecular Plant 8, 796–813, May 2015 ª The Author 2015. 797
XAANTAL2 (AGL14) in Arabidopsis SAM Transitions Molecular Plant
under short-day (SD) condition (Irish and Sussex, 1990; Bowman
et al., 1993; Borner et al., 2000; Ferrario et al., 2004; Masiero
et al., 2004; Yu et al., 2004; Liu et al., 2007; Trevaskis et al.,
2007; Fornara et al., 2008). The phenomenon known as ‘‘floral
reversion’’ has been also described in heterozygous lfy, ap1,
ap2, and agamous (ag) mutants, suggesting that these
genes repress this process and favor FM determinacy (Battey
and Lyndon, 1990; Okamuro et al., 1993, 1996, 1997). There
is no explanation or mechanistic model to account for the
permanence of inflorescence characteristics when LFY and
AP1 are prematurely expressed in theMADS-box overexpression
lines.
XAANTAL2 (XAL2/AGL14) is a MADS-box gene preferentially
expressed in the root (Rounsley et al., 1995; Garay-Arroyo
et al., 2013). The name XAANTAL2 was given because xal2
mutants have short roots similar to those of xaantal1/agl12
(Tapia-López et al., 2008; Garay-Arroyo et al., 2013). Here, we
report that XAL2 is also a key player in SAM cell identities and
transitions. It promotes flowering and presents similar loss and
gain of function phenotypes such as AGL24 and SOC1. We
also show that overexpression of XAL2, SOC1, and AGL24 are
able to up-regulate TFL1, thus explaining, at least in part,
the prevalence of vegetative traits, even if AP1 and LFY are
prematurely expressed, supporting that XAL2 is also important
for FM maintenance. Here, we propose a dynamic GRN and
epigenetic landscape (EL) models (Álvarez-Buylla et al., 2008,
2010b; Villarreal et al., 2012) that integrate our data with
previous results to provide a mechanistic and dynamic
framework to understanding normal and altered cell fates and
transitions at the Arabidopsis SAM. This model thus provides a
mechanistic explanation for apparently paradoxical data for
other loss and gain of function phenotypes (Borner et al., 2000;
Ferrario et al., 2004; Masiero et al., 2004; Yu et al., 2004; Liu
et al., 2007; Trevaskis et al., 2007; Fornara et al., 2008) allowing
the integration of additional components.
RESULTS
XAL2 Promotes Flowering Transition
XAL2 is a member of the TM3/SOC1 clade, belonging to the type
II MADS-box genes (Álvarez-Buylla et al., 2000; Martı́nez-Castilla
and Álvarez-Buylla, 2003; Parenicová et al., 2003; Smaczniak
et al., 2012). Except for XAL2 (Garay-Arroyo et al., 2013), all
other members of this clade have been identified as activators
of flowering transition (Lee et al., 2000; Moon et al., 2003;
Schmid et al., 2003; Schönrock et al., 2006; Dorca-Fornell
et al., 2011). Given the role of all other members of SOC1
clade, we hypothesized that XAL2 could also be involved in
flowering and tested two xal2 alleles under four conditions:
long-day (LD) and SD photoperiods, vernalization plus LD, and
GA3 treatment plus SD. In addition, we generated doublemutants
using the xal2-2 allele (which has less somatic En-excision rates
than xal2-1) and soc1-6, agl24-4, and ful-7 mutants, because
SOC1, AGL24, and FUL proteins interact with XAL2 in the yeast
two-hybrid system, suggesting that they form dimers (de Folter
et al., 2005; van Dijk et al., 2010).
Under LD condition both xal2 alleles (Garay-Arroyo et al., 2013)
showed a subtle but significant delay in bolting time (Figure 1A
and 1B and Supplemental Table 1). Under the same condition,
soc1-6 was epistatic over xal2-2, while xal2-2 and ful-7 had a
slightly additive effect on bolting time compared with the
parental plants. No differences were observed in the xal2-2
agl24-4 double mutant with respect to single mutants
(Figure 1B). Interestingly, the rosette leaf number (RLN) did not
always coincide with the bolting time phenotype (Figure 1B and
1C). In fact, xal2-1 and xal2-2 alleles and xal2-2 ful-7 have the
same number of leaves as wild-type plants under LD condition,
while xal2-2 soc1-6 double mutants had fewer leaves than
soc1-6 (Figure 1B and Supplemental Table 1).
Under SD condition both xal2 alleles are remarkably delayed
compared with wild-type plants and only xal2-2 soc1-6 plants
showed an additive bolting time phenotype in comparison with
both parentals (Figure 1C and Supplemental Table 1). However,
xal2-2 was epistatic over agl24-4 and ful-7 mutants under this
condition (Figure 1C). Unexpectedly, the xal2-2 soc1-6 RLN
is lower than in both parental lines (Figure 1C). Therefore,
it seems that XAL2 effects on bolting time and rosette leaf
development are partially independent. We also found that
cauline leaf number is diminished in xal2-2 only under SD
condition and is epistatic over soc1-6, agl24-4, and ful-7
(Supplemental Figure 1A).
Since GA plays a relevant role in flowering under SD, we tested
the effect of this hormone in all mutants. GA application partially
suppressed flowering phenotypes under SD condition in all cases
except for xal2-2 soc1-6 (Figure 1C and 1D). Interestingly, 62% of
the xal2-2 soc1-6 plants grown under SD condition were unable
to flower after 117 days after sowing (DAS), and none of them
flowered after GA treatment (88 DAS), thus suggesting that
XAL2 and SOC1 additively participate in GA response during
flowering transition. To explore how the impairment of GA
response in xal2-2 soc1-6 affects GA homeostasis, we assayed
two GA biosynthesis genes (GA20OX1 and 2) and a catabolic
one (GA2OX1; Rieu et al., 2008) at 14 DAS, when most of the
flowering time genes are up-regulated under LD condition. Our
results in the double mutant showed up-regulation of GA20OX1
compared with xal2-2 and down-regulation of GA2OX1
compared with wild-type plants (Supplemental Figure 2A). This
finding suggests a compensatory mechanism in which the plant
tries to make up for reduced GA responses by producing more
GA. Further analysis should be performed to clarify the role of
XAL2 in relation to SOC1 in GA homeostasis during flowering
transition.
Overall, our results for single and double mutants indicate that
both xal2 alleles have a delayed bolting time compared with
wild-type plants under all conditions tested, except for vernaliza-
tion treatment (Figure 1A–1D and Supplemental Table 1). To
further explore the role of XAL2 in flowering transition and to
uncover possible redundancies of this gene with other related
MADS-box genes, we generated several 35S::XAL2 lines and
selected three of them that showed the highest levels of XAL2
transcript accumulation (Supplemental Figure 2B) and similar
phenotypes among them (see description in the following
paragraphs). In Figure 1E and Supplemental Figure 1B we show
that 35S::XAL2 line (9T4) has a similar early bolting time and
fewer rosette and cauline leaves in comparison with wild-type
plants, under both LD and SD condition. Therefore, XAL2 is
798 Molecular Plant 8, 796–813, May 2015 ª The Author 2015.
Molecular Plant XAANTAL2 (AGL14) in Arabidopsis SAM Transitions
Figure 1. XAL2 Participates in Flowering Transition.
(A) The mutant allele xal2-2 and the overexpression line 35S::XAL2 are late and early flowering compared with wild-type (WT) plants, respectively.
(B) Flowering time of double mutant plants xal2-2 ful-7, xal2-2 agl24-4, and xal2-2 soc1-6 compared with parental and WT plants grown under long-day
(LD) condition, showing that soc1-6 is epistatic over xal2-2. DAS, days after sowing.
(C) The same plants grown under short-day (SD) condition showed that the xal2-2 soc1-6 double mutant plants have an additive effect compared with the
parental and WT plants.
(D)GA3 application mostly suppressed the late flowering phenotype of all genotypes. Note that none of the xal2-2 soc1-6 double mutant plants flowered
after 88 DAS.
(E) Overexpression of XAL2 is sufficient to induce a similar early bolting time phenotype under LD and SD conditions.
Flowering transition was analyzed as the bolting time (gray bars) expressed in DAS and the rosette leaf number (white bars) as mean ± standard error
(n = 35–42 plants under LD and n = 16–23 under SD and SD+GA). Lineswith statistically significant differences compared withWT plants (black asterisks)
or single mutants (red asterisks) are indicated as *P < 0.05, **P < 0.01, and ***P < 0.001 according to one-way analysis of variance (ANOVA) following
Tukey’s multiple comparison test.
Molecular Plant 8, 796–813, May 2015 ª The Author 2015. 799
XAANTAL2 (AGL14) in Arabidopsis SAM Transitions Molecular Plant
m
 
B
o
lt
in
g
 t
im
e 
(D
A
S
) 
N
 
.¡
,.
 
0
\
 
0
0
 
o
 
o
 
o
 
o
 
o
 
Ip
, 
J~ 
/>
.(
 
<.
P.
 
<:
J 
3
4
.3
 
"~ 
<2
.( 
* 
1
8
.3
 
~ 
<:
J 
2
 
* 
J~ 
~.( 
* * 
H
71
.1
 
<.
P.
 
<:
J 
"~ 
46
.1
 
<2
&
 
2
2
4
 
t 
'(
) 
. 
*
 
o
 
N
 
.¡
,.
 
0
\
 
0
0
 
o
 
o
 
o
 
o
 
R
o
se
tt
e 
le
a
f 
n
u
m
b
e
r 
B
o
lt
in
g
 t
im
e 
(D
A
S
) 
("
) 
....
.. 
....
.. 
N
 
.¡
,.
 
0
\
 
0
0
 
o
 
N
 
o
 
o
 
o
 
o
 
o
 
o
 
o
 
+
. 
ft-.
.j>
 
7
2
.5
 
Q
.¿
 
5
9
.4
 
* 
<P
° 0
 
~ 
~
~5.6
~~ 
Q.~ 
H
7
8
.5
/ 
<
P
/' 
a
' 
7
4
 
* 
?
v
 
: 
7
8
.4
"*
 
+
. 
'v
 
+
. 
Q.¿~ 
57
 
Q
.¿
 
~ 
¿-
-'.
.>
 
~ 
8
0
.9
 
~ 
~~ 
57
.9
 
+
Q
. 
Q
. 
¿-
-'.
.>
 
9
2
.2
 
?
<
P
/'
 
54
.1
 
~?v 
~
92.8 
<P
o 
'v
 
o,.
....
. 
65
.6
 
* 
~ 
10
7 
~ 
'a
' 
6
4
.4
 
* 
o
 
N
 
.¡
,.
 
0
\
 
0
0
 
....
.. 
o
 
o
 
o
 
o
 
o
 
o
 
R
o
se
tt
e 
le
a
f 
n
u
m
b
e
r 
B
o
lt
in
g
 t
im
e 
(D
A
S
) 
N
 
.¡
,.
 
0
\
 
o
 
o
 
o
 
o
 
+
. 
~ 
j=
::;
:;;
:;:
 
Q
.¿
 
if>
. 
'.
.,
 
0
0
 
.
.
.
 
.I-
-~ 
Q
: 
~ 
-
I
--
-
-
-t
J 
<
P
/' 
a
' 
:1
==
=:
11
'4"2
' 
?
v
 
+
. 
'v
-
,-
_
_
 
0
0
 
o
 
+
. 
Q.¿~¿ 
Q
/,
 
~ 
'.
.>
 
?~~ 
+
. 
Q
: 
¿-
-'.
.>
 
6
0
.1
 
-o o
 
Q
.¿
 <
P
"2
 
3
5
.6
 
~ 
v 
1:;
;;;
 
16
8
.5
 
....
.. 
N
 o
 
N
 o
 
<P
o 
'v
 
3
2
 
o,.
....
. 
:r::
==
==
==
==
=::
J.
'-..J 
'a
'I
 
o
 
N
 
.¡
,.
 
0
\
 
0
0
 
...
...
 
...
...
 
o
 
o
 
o
 
o
 
o
 
N
 
o
 
o
 
R
o
se
tt
e 
le
a
f 
n
u
m
b
e
r 
e 
w
 
~ ~ t 
I .'
(
. 
~ 
ré
"·
·, 
/
,
' 
, 
~.
~'
, 
1
1
' 
¿~\ 
" 
" 
:..,,,
, 
::;
 
' •.
 >~-
~ 
)
¡~ 
, 
~
,' 
~ 
. 
1\;
'
.1
 
$:
:¡ 
;.j
.
\ 
~
~; 
1
 "-
, 
'
f
. 
,I
l 
W
 
t,'
,' 
.~ 
. 
»
 
B
o
lt
in
g
 t
im
e
 (
D
A
S
) 
O
J 
....
.. 
N
 
W
 
.¡
,.
 
o
 
o
 
o
 
o
 
o
 
+.~ Q
.¿
 
E=
==~==
;;
29 
<P
o 
~ 
o,.
....
. 
1
3
2
 
~ 
~/' 
'a
' 
* 
+
. 
?~ 
2
3
9
 
*
3
9
.9
 ;
 
+.Q/,
,~v 
. 
* 
~ 
Q
/.
.,
 ~~ 
..
..
 <
 
3
0
.8
*
 ~ 
..
. ~ 
..
. 
:.
>
 
.* 
* 
+
Q
/,
 
Q
. ~¿ 
3
2
.2
 
~ 
?
 
<
P
/' 
'.
.>
 
* 
* 
3
4
.1
 
~ 
~?v 
<P
o 
'v
 
-
-
~ 
J 
1.
9 
o,.
....
. 
~ 
'a
' o
 
....
.. 
o
 
~ 
19
.5
 
* 
W
 
o
 
3
1
.9
 H
 3
9
.2
 
-
r
o
-
.¡
,.
 
o
 
R
o
se
tt
e 
le
a
f 
n
u
m
b
e
r 
sufficient to induce early flowering independently of photoperiod,
and may participate in the IM to FM transition. These results
confirm that XAL2 is a key component of the GRN that controls
flowering transition.
XAL2 Is Part of the GRN that Induces AP1 during Floral
Transition
We then analyzed the role of XAL2 in the flowering transition GRN
using quantitative RT–PCR. In agreement with the flowering tran-
sition phenotypes observed in Figure 1, we found that AP1
expression was down-regulated in xal2-2 (Figure 2A) while LFY
did not show significant repression (Figure 2B). In contrast,
both genes were up-regulated in the XAL2 overexpression line
(Figure 2A and 2B). SOC1 and AGL24 were also down-
regulated in xal2-2, indicating that XAL2 positively regulates
both genes. Surprisingly, XAL2 overexpression drastically
repressed SOC1 and AGL24 (Figure 2C and 2D). Therefore, it is
possible that the early flowering phenotype observed in the
XAL2 overexpression line is due to an up-regulation of LFY and
AP1 and that this is partially independent of SOC1-AGL24.
XAL2 is down-regulated in constans-1 mutant (co-1; Han et al.,
2008), indicating that CO positively regulates XAL2 when plants
are grown under LD condition (Figure 2E). XAL2 transcript
accumulation in soc1-6 (Wang et al., 2009) and agl24-4 was
unaffected (Figure 2F and 2G). Thus, at the transcriptional level,
XAL2 is regulated by CO and positively regulates SOC1 and
AGL24. Interestingly, when we analyzed XAL2 accumulation in
the SOC1 (agl20-101D; Lee et al., 2000) and AGL24 (Yu et al.,
2002) overexpression lines, we found that XAL2 was strongly
repressed only in agl20-101D (Figure 2F and 2G). This suggests
that XAL2 and SOC1 overexpression lines induce early
flowering independently of one another.
In summary, the RT–PCR results indicate that XAL2 is an impor-
tant component of the GRN that regulates AP1, is under the
control of CO, and participates in the up-regulation of SOC1
and AGL24 upon floral transition.
XAL2 Participates in Flower Meristem Maintenance and
Determinacy
To address the role of XAL2 in FM development, we analyzed its
spatio-temporal expression pattern with in situ hybridization at
different FM stages. XAL2 expression appears very early at the
flanks of the IM in the anlagens upon the transition to flowering
(Figure 3A). Subsequently, XAL2 expression levels increase in
the first and second stages of the FM (Smyth et al., 1990), and
Figure 2. XAL2 Regulation in the Flowering Gene Regulatory
Network (GRN) under LD condition.
(A) XAL2 positively regulates AP1.
(B) LFY is up-regulated by XAL2 only in the overexpression line.
(C and D) SOC1 (C) and AGL24 (D) are down-regulated in xal2-2, and are
repressed in the XAL2 overexpression line.
(E) CO positively regulates XAL2.
(F) Overexpression of SOC1 represses the expression of XAL2, but
no significant difference in the latter was observed in soc1-6 with respect
to WT.
(G) AGL24 does not regulate XAL2.
Relative mRNA accumulation from three biological replicates were ob-
tained from 14 DAS seedlings (blue bars) and 10 DAS plants (red bars)
grown under LD condition. Data are shown as mean ± standard error.
Statistical significance (**P < 0.01) was evaluated using theMann–Whitney
test.
800 Molecular Plant 8, 796–813, May 2015 ª The Author 2015.
Molecular Plant XAANTAL2 (AGL14) in Arabidopsis SAM Transitions
at stage 3, XAL2 is restricted to the L1 and L2 layers (Figure 3A
and 3B). Later on, XAL2 is expressed in the gynoecium and
stamen primordia at stage 6 (Figure 3B). Interestingly, XAL2
mRNA is also detected in the IM periphery (Figure 3B). We
used xal2-2 mutant as a negative control to rule out cross-
hybridization of our probe with the closely related AGL19
mRNA (Figure 3C).
The XAL2 spatio-temporal expression pattern is similar to that of
AGL24 and SVP (Yu et al., 2004; Liu et al., 2007; Gregis et al.,
2009). It has been reported that SOC1, in addition to the latter
two genes, are important during the first two stages of FM
development, but are repressed at stage 3 for proper
subsequent FM differentiation (Yu et al., 2004; Gregis et al.,
2006; Liu et al., 2007; Gregis et al., 2009; Liu et al., 2009).
Furthermore, LFY and AP1 (particularly the latter) repress
AGL24 and SVP at FM stage 3 (Yu et al., 2004; Liu et al., 2007).
Coincidentally, we found that XAL2 accumulation is higher in
the ap1-15 (Ng and Yanofsky, 2001) and ap1-1 cal-5 mutants
(Ferrándiz et al., 2000) and is down-regulated in the tfl1-2mutant,
in which the IM is converted into FM (Supplemental Figure 3A;
Shannon and Meeks-Wagner, 1991; Alvarez et al., 1992). In
agreement, an opposite pattern of expression for LFY was
detected in these mutants (Supplemental Figure 3A). Therefore,
AP1 and CAL probably repress XAL2 in the FM at stage 3, as
occurs with AGL24 and SVP.
As already explained, XAL2 overexpression induces early flower-
ing with the production of very few rosette leaves (Figure 1A and
1E). It is noteworthy that cauline leaves in these lines are rounder
and larger, similar to rosette leaves (Figure 5A), and flowers have
leaf-like traits, such as large sepals that remain indehiscent after
fertilization (Figures 3D and 5D), and sepal cells with a
morphology reminiscent of wild-type leaf cells (Figure 3E–3G).
These phenotypes are similar to those reported for SOC1,
AGL24, and their homolog overexpression lines (Borner et al.,
Figure 3. XAL2 Spatial and Temporal Expression in WT SAM during the Transition to Flowering and XAL2 Overexpression Floral
Phenotypes.
(A–C)mRNA XAL2 in situ hybridization inWT and xal2-2 inflorescences. (A) XAL2 is detected in the anlagen (a), stage 1 of the floral meristem (FM), and L1
and L2 layers of FM stage 3. (B) XAL2 accumulates at stage 2 of the FM and at the inflorescencemeristem (IM). Later in FMdevelopment (stage 6), XAL2 is
also detected in the stamen (st) and gynoecium (g) primordia. (C) As a negative control, no signal was detected when XAL2 antisense probe was used in
the xal2-2 mutant.
(D–G) Floral phenotype of the 35S::XAL2 compared with WT grown under LD condition. (D) 35S::XAL2 sepals are larger than WT sepals, and scanning
electron micrographs show that the cellular identity of the 35S::XAL2 sepals (F) is more similar to WT leaf cells (G) than toWT sepal cells (E), including the
presence of trichomes (F).
(H) Under SD condition, early arising carpels of the 35S::XAL2 plants elongate and inflorescences develop inside them.
(I–K) Longitudinal toluidine blue-stained sections confirmed that flowers at different developmental stages can be observed inside the 35S::XAL2 carpels
compared with a similar stage of WT carpels based on ovule development (I). (K) Magnification of the rectangle in (J) shows a FM at stage 4 of
development (yellow arrowhead) and pollen grains (red arrowheads) inside the XAL2 overexpression carpel.
Scale bars correspond to 50 mm (A–C), 20 mm (E–G), and 500 mm (I–K).
Molecular Plant 8, 796–813, May 2015 ª The Author 2015. 801
XAANTAL2 (AGL14) in Arabidopsis SAM Transitions Molecular Plant
2000; Michaels et al., 2003; Ferrario et al., 2004; Masiero et al.,
2004; Yu et al., 2004; Liu et al., 2007; Trevaskis et al., 2007;
Fornara et al., 2008). Interestingly, in the 35S::XAL2 under
SD condition, early arising basipetal carpels deformed (stages
14–17 of flower development; Smyth et al., 1990; Roeder and
Yanofsky, 2006) and a whole inflorescence grew from inside,
bearing new fertile flowers (Figure 3H, 3J, and 3K). The flowers
that develop from these indeterminate carpels attain different
developmental stages, from FM (stage 4 in the picture) up to
flowers with mature ovules and pollen grains (Figure 3I and 3K).
These results indicate that correct spatio-temporal control of
XAL2 expression is fundamental for normal FM cell differentiation
and determinacy.
In conclusion, XAL2 overexpression accelerates the transition to
flowering by terminating the vegetative phase prematurely, but at
the same time the flowers produced in these lines show leaf-like
traits. In addition, under non-inductive flowering conditions, over-
expression of XAL2 prevents FM determinacy, leading to what
appears to be cell reprogramming with some carpel cells func-
tioning as IM cells.
XAL2 Overexpression Positively Regulates TFL1
and WUS Expression, and Directly Binds
to the TFL1 Regulatory Sequences
To unravel the molecular basis of the XAL2 overexpression
phenotypes in which this gene up-regulates AP1 and LFY
(Figure 2A and 2B), and at the same time yields flowers with
some ap1 mutant characteristics (Figure 3D and 3F; Irish
and Sussex, 1990; Bowman et al., 1993), we hypothesized
that an inflorescence identity gene capable of repressing AP1
could be involved. Hence, we analyzed the expression of TFL1
in xal2-2 and the 35S::XAL2 line, and found that it was down-
and up-regulated in these lines, respectively (Figure 4A).
Furthermore, to establish whether XAL2 is able to directly bind
to TFL1 regulatory sequences, we performed a chromatin
immunoprecipitation (ChIP) experiment using a 35S::GFP-XAL2
line. In Figure 4B we show that three different TFL1 regulatory
regions containing CArG boxes are enriched (III, V, and VI)
within the 50 promoter and the intergenic region downstream of
the 30 stop codon of the TFL1 gene. These results strongly
support that under constitutive expression, XAL2 directly binds
to TFL1 regulatory sequences. Since SOC1 and AGL24
overexpression phenotypes are similar to those of 35S:XAL2,
we further analyzed TFL1 transcript accumulation in these
two lines. TFL1 was also up-regulated in agl20-101D and
35S::AGL24 (Figure 4C).
We have already shown that AP1 is up-regulated in the XAL2
overexpression line at 10 DAS (Figure 2A), but we wanted to
be sure that the ap1-like phenotype was not due to its down-
regulation at different developmental stages. We performed an
AP1 expression time course from 8 to 14 DAS plants, and at
all time points analyzed 35S:XAL2 plants showed higher levels
of AP1 than wild-type plants (Supplemental Figure 3B). Hence,
we can conclude that leaf-like traits of the 35S::XAL2 flower or-
gans are not due to decreased levels of AP1. Also, SOC1 and
AGL24 overexpressors showed up-regulation of AP1 as ex-
pected (Figure 4D). Thus, overexpression of XAL2, SOC1, or
AGL24 is able to up-regulate both TFL1 and AP1 and cause
early flowering, and at the same time yield flowers with leaf-
like cell types.
FM cells stop proliferating after the carpels are formed in wild-
type flowers (Mizukami and Ma, 1997). WUSCHEL (WUS), a key
gene involved in the identity of stem cells in the SAM, is
repressed in the central zone of the FM at stage 6 (Lenhard
et al., 2001; Lohmann et al., 2001; Sun et al., 2009). Since we
observed that in non-inductive flowering conditions FM determi-
nacy is lost and a new inflorescence emerges from inside of early
arising carpels in the XAL2 overexpression lines (Figure 3H, 3J,
and 3K), we hypothesized that WUS is persistently expressed in
these lines. Therefore, we analyzed WUS and TFL1 mRNA
accumulation in these carpels compared with wild-type carpels
at similar developmental stages. Indeed, we found that both
genes were up-regulated in the XAL2 overexpression line
(Figure 4E and 4F), confirming that when XAL2 is de-regulated,
some of the molecular components that are important for IM
and FM identity and determinacy are also altered.
To test whether the floral ‘‘reversion’’ phenotype of the 35S::XAL2
line was due to higher competence of TFL1 over AP1, we crossed
it to a 35S::AP1 plant to test whether the excess of AP1 could
counteract TFL1 (Figure 5; Mandel and Yanofsky, 1995). As
expected, both lines partially complemented each other’s
phenotypes in the double overexpressor line grown under SD
condition, resulting in plants with smaller cauline leaves and
flowers with reduced sepals compared with the 35S::XAL2
parental plant (Figure 5A, 5C, 5D, and 5F). On the other hand,
the conversion of inflorescences into solitary flowers, typical of
the 35S::AP1 line, disappeared (Figure 5B and 5C). Although
both parental lines were early flowering, the bolting time of
the 35S::AP1 35S::XAL2 line was the same as for 35S::AP1
plants, but the double overexpressor line had an intermediate
number of rosette leaves with respect to both parental lines
(Figure 5G). Interestingly, the double overexpressor had fewer
swollen carpels compared with the XAL2 overexpression line
(Figure 5H), indicating that the indeterminacy observed in the
35S::XAL2 FM (Figure 3H, 3J, and 3K) was almost recovered
when AP1 was increased.
GRN and EL Modeling for XAL2 Interactions under LD
Condition: A Mechanistic Dynamic Explanation for
XAL2, SOC1, and AGL24 Overexpression Phenotypes
Our data uncover a complex set of interactions and roles for XAL2
in SAM transitions. To provide an integrative, system-level,
dynamic and mechanistic explanation for our results, a GRN
modeling approach is required. We integrated the evidence
of this work, together with previously reported information
(Supplemental Table 2), to uncover a necessary and sufficient
set of components and interactions (i.e., dynamic GRN module)
that recover observed patterns of gene expression in vegetative
meristem (VM), IM, and FM cells in wild-type plants. These
patterns correspond to the expected set of steady-state gene
configurations to which such wild-type GRN should converge
(Figure 6 and Table 1), and can be validated if it also explains
what we observe in the analyzed loss and gain of function lines.
We formalized experimental data as logical functions following
previous studies (see the Methods section and Supplemental
802 Molecular Plant 8, 796–813, May 2015 ª The Author 2015.
Molecular Plant XAANTAL2 (AGL14) in Arabidopsis SAM Transitions
Table 3). We were able to recover a wild-type GRN model under
LD condition that integrates the data presented in this work and
the necessary and sufficient set of additional components and in-
teractions from the literature, to recover the expected steady
states for VM, IM, and FM for the genes considered (Figure 6C
and Table 1). Loss of function lines were simulated by turning
the corresponding gene to ‘‘0’’ during the complete simulation,
while the overexpression lines were simulated by turning the
corresponding gene to ‘‘2.’’ The proposed GRN is validated
because, as expected by the observed phenotypes, all single
loss of function mutants qualitatively recovered the same set of
steady states as wild-type GRNs, while gain of function GRN
Figure 4. XAL2 is a Positive Regulator of TFL1.
(A) TFL1 relative RNA accumulation is down-regulated in xal2-2 and up-regulated in the XAL2 overexpression line.
(B) Chromatin immunoprecipitation (ChIP) assay was performed to examine in vivo binding of XAL2 to the TFL1 regulatory regions in the XAL2
overexpression line. Top panel shows a schematic diagram of the TFL1 locus, indicating in roman numerals the regions amplified by PCR after DNA
immunoprecipitation. Primers flanking the CArG boxes (+) and their positions relative to the TFL1 transcriptional start site are indicated. The bottom panel
shows DNA fragments corresponding to TFL1 III, V, and VI regions enriched in the 35S::GFP:XAL2 plants after ChIP with a GFP antibody.
(C and D) Up-regulation of TFL1 (C) and AP1 (D) in the SOC1 (agl20-101D) and 35S::AGL24 overexpression lines.
(E and F) Higher RNA accumulation levels of both TFL1 (E) and WUS (F) were detected in first arising carpels of the 35S::XAL2 plants compared with
WT plants grown under SD condition.
Quantitative RT–PCR was performed with RNA extracted from 14 DAS seedlings (blue bars), 10 DAS seedlings (red bars), and carpels with similar ovule
stage development (green bars). Data in (A) and (C–F) are shown as mean ± standard error. Statistical significance (*P < 0.05, **P < 0.01) was evaluated
using Mann–Whitney test.
Molecular Plant 8, 796–813, May 2015 ª The Author 2015. 803
XAANTAL2 (AGL14) in Arabidopsis SAM Transitions Molecular Plant
A TFLl B TFLl (At5g03840) 
= 6 ** .~ 
'" 5 
3.9 I n In IV V VI 
'" ~ - - - -
a 4 -680 to -336 1924 to 2112 2792 to 3008 
~ 3 
-1751 to -1 539 -355to-31 
~ 2 * 
+i III 
+ t .~ + + + I +4315 to 4539 .... 1 ~ 1000 bp 
Q:j o AtSg03837 
~ ~~ 8/'1,. ~~'1,. ~ iJ 
",~S· ' 
nI V VI 
GFP-antibody I e 
TFLl 
(+) 
= I .~ 6 ** 
GFP-antibody 
'" '" 5 (-) 
~ 4.0 
'"' Input I ' I ~ 4 * 
~ 3 
~ 
2 ~~~'1,. .~ .... 
1 ~ 
Q:j 
O ~ 
~~ í::J\{) ~0V1' 
.0~ 
",~S·' 
í::J;\. 
S·~ o-3;'V ","> . 
D APl E 
TFLl ** F 
WUS ** 
= = 5.4 = 31.0 
.~ 15 ** .~ 6 .~ 40 ~ '" 11.6 '" I '" ~ 5 ~ ~ 30 
alO ** 
a4 Q. 
~ ~ 3 ~ ' // 
~ 1.0 
~ 
5 ~ 2 
~ 
1 .~ 1.0 .~ .... ~ 1 
.... 
~ ~ 
Q:j O Q:j O Q:j O 
~ 
~í::J\{) ~0V1' 
~ 
~~ ')., ~ 
~ 
')., 
S~ S~ í::J;\. 
o-3;'V S·~ ~. ","> . 
~. 
'" '" 
simulations recovered wild-type steady states, plus a new steady
state with an IM/FM mixed cell identity (Figure 6C and Table 1).
Indeed, experimental data has shown that soc1, agl24, and xal2
single mutants do not modify cell identities but only flowering
time, which cannot be simulated with this version of the model.
On the other hand, XAL2, SOC1, or AGL24 overexpression not
only modifies flowering time, but also produces some flowers
with inflorescence-like characteristics. Coincidentally, our model
suggests that such flowers have some cells with a mixed IM/FM
identity.
To gain further insight into how the alteration in the expression of
SOC1, AGL24, and especially XAL2 modify SAM cell transitions,
we propose an EL analysis similar to that reported by Álvarez-
Buylla et al. (2008) (Figure 7 and Supplemental Figure 4).
Such analysis addresses whether the set of components
and interactions considered in the uncovered GRN module in
Figure 6B also underlies the observed temporal pattern of
Figure 5. Complementation Analysis of the
35S::XAL2 and the 35S::AP1 Phenotypes in
the Double Overexpressor Plants Grown
under SD Condition.
(A and D) XAL2 overexpression plants have large
cauline leaves similar to rosette leaves (A) and
flowers with large sepals that persist after fertil-
ization (D).
(B and E) 35S::AP1 plants show a determinate
growth in which each pedicel gives rise from two
to three terminal flowers (B). Flowers of the
35S::AP1 are similar to those of WT (E).
(C and F) Determinate growth of the 35S::AP1 line
is complemented in the double overexpressor
35S::AP1 35S::XAL2 plants. On the other hand,
the cauline leaves phenotype of the 35S::XAL2
is complemented to WT in this line (A–C). Sepals
of the double overexpressor line are partially
complemented, resulting in sepals that are much
smaller than the 35S::XAL2 sepals (D and F).
Scale bars correspond to 1 cm (A–C) and 2 mm
(D–F).
(G and H) The double overexpressor plants (G)
have the same bolting time as the 35S::AP1 line,
but have an intermediate rosette leaf number
compared with parental plants. The number of
indeterminate carpels (NIC) along the shoot axis of
the double overexpressor (H) is also reduced
compared with the 35S::XAL2 line. Bars corre-
spond to standard error from average (n = 26–32
plants). Statistical significance with respect to
parental plants (***P < 0.001; red asterisks) was
evaluated according to one-way ANOVA following
Tukey’s multiple comparison test (G) or Mann–
Whitney test (H).
transition among cell types in wild-type
and other lines (steady states): VM > IM >
FM. Importantly, this type of model can
discriminate between two hypotheses: the
observed leaf-like structures in flowers of
the overexpressors is due to a reversion
from FM cells to IM cells, or in these lines a
new type of steady state withmixed identity (IM/FM) appears dur-
ing SAM development. Thus, this and the GRN modeling provide
a mechanistic explanation for the apparently paradoxical pheno-
type of the overexpressors.
We thus performed a stochastic simulation of the proposed GRN
model to propose a model for a population of cells at the SAM
(see Supplemental Methods). Since VM cells are the first to
attain their fate in wild-type, all cells were assumed to be in this
state at initial conditions. Thus, in the vector with the proportion
of cells in each GRN steady state for the dynamic stochastic
equation, VM was set to 1 and the rest to 0 (Figure 7A–7D). This
equation was iterated to follow the changes in the probability
of reaching each one of the other steady states over time. The
graph clearly shows how the trajectory for each of the steady
states’ probability reaches its maximum at a given time. In
accordance with biological observations, the results show that
the most probable sequence of cell attainment is VM > IM > FM
804 Molecular Plant 8, 796–813, May 2015 ª The Author 2015.
Molecular Plant XAANTAL2 (AGL14) in Arabidopsis SAM Transitions
A 35S::XAL2 B 
G 
35S::APl e 
H 
6 
5 
4 
;:: 3 
z 
2 
35S::APl 
35::XAL2 
1.3 
in wild-type plants (Figure 7A). In conclusion, our simulations
suggest that the complex GRN that underlies the attainment of
VM, IM, and FM cell identities also restricts, to a large extent,
the temporal pattern of transitions among them as found for the
floral organ specification GRN reported by Álvarez-Buylla et al.
(2008).
Interestingly, in the case of gain of function simulations of XAL2,
SOC1, and AGL24, the same pattern of temporal transitions as
in wild-type was recovered, but in these cases the maximum
probability of the mixed IM/FM identity occurs after the IM and
before the FM configurations (Figure 7B–7D). This analysis also
recovers all the possible transitions among the steady states
(Figure 7E and 7F). The net transition rate was positive for the IM
to FM direction in all the lines tested, but was lower under gain
of function lines in comparison with the wild-type (Supplemental
Figure 4). This means that the net probability flow preferentially
follows the direction from IM to FM, both in wild-type and in
each of the overexpression lines of XAL2, SOC1, and AGL24
(Figure 7F). These results are consistent with the observed most
probable temporal order of transitions in plants. Likewise, the
results do not support the hypothesis of an induced, reverse
Figure 6. Model for the XAL2 Regulatory Network Module during SAM Development and Its Steady States for the WT, Loss and Gain
of Function XAL2, SOC1, and AGL24 Lines.
(A) Schematic representation of SAM transitions from a vegetative (VM) to inflorescence (IM) and floral (FM) meristem states.
(B) GRN showing the interactions uncovered in this paper and published results (see Supplemental Table 2). Arrows (green) and bar-lines (red) indicate
induction and repression, respectively. In some cases, we discovered that the sign of the interaction inferred changed depending if the loss or gain of
function lines were being tested (regulation of XAL2, SOC1, and AGL24 over some of their targets). Dotted lines represent predictions of regulations that
need further verification. In the case of GA and CO, the positive feedback loops are introduced because their upstream regulators that keep them turned
on were not considered in the model proposed here. AP1 plus SOC1 or AGL24 indicate protein dimers that repress TFL1 (Liu et al., 2013).
(C) A schematic representation of the network in (B) is used to represent the steady states achieved by this model under the different lines considered
(columns). In each row, the steady states corresponding to the VM, IM, FM, or the novel IM/FM state recovered in the overexpressors. The components of
the network are shown by squares or a circle (GA) that are turned on/off in each of the steady-state configurations being considered. The colors
correspond to the activation state of the node in each case: red = 0; green = 1; yellow = 0 or 1; purple = 1 or 2; light blue = 2; and dark blue = 0, 1, or 2.
Molecular Plant 8, 796–813, May 2015 ª The Author 2015. 805
XAANTAL2 (AGL14) in Arabidopsis SAM Transitions Molecular Plant
A 
e 
B 
~------- ~ ~ --------~ 
\ -' 
WT xal2 XAL2-0E soc1 SOCI-OE agl24 AGL24-0E 
VM -- ~ - B-- G- = --=- a--E-- §- ----- ---- ---- ---- -- ¡;::¡ - ---- ---= 
'-------_e -------, . • • • • • 
rate of transition fromFM to IMor IM/FMcells as an explanation of
the observed phenotype in the overexpressors, as both reverse
transitions (FM to IM and FM to IM/FM) showed a negative net
transition rate (Figure 7E). Overall, the results of the stochastic
EL analysis suggest that instead of an accelerated rate of
transition in the forward (IM to FM) direction, it is the novel
potentiality of the IM state to now choose between two
preferential (positive net transition rate) fate decisions (FM or IM/
FM phenotypes) induced by gene overexpression that accounts
for the observed promiscuous IM/FM state in such flowers
(Figure 7F and Supplemental Figure 4).
DISCUSSION
In this work we have shown, in contrast to previous expectations
(Schönrock et al., 2006; Garay-Arroyo et al., 2013), that XAL2 is
expressed in the IM and FM and is a key player in the complex
GRN underlying SAM transitions (Figure 6). XAL2 is a promoter
of flowering in response to multiple signals and is also
important for FM maintenance and determinacy. We propose a
GRN and EL modeling approach that together provides a
mechanistic dynamic framework to explain the role of XAL2
at the SAM and the apparently paradoxical phenotypes of
its overexpression. Moreover, such a modeling framework
constitutes a systemic mechanistic explanation for the
observed patterns of expression of multiple genes underlying
VM, IM, and FM cell fates, and the observed transitions among
them in wild-type Arabidopsis. It thus constitutes a useful frame-
work to incorporate additional components and interactions
that participate in SAM development. Finally, it provides an
explanation for AGL24, SOC1, and their homolog overexpression
phenotypes in Arabidopsis (Borner et al., 2000; Michaels et al.,
2003; Ferrario et al., 2004; Masiero et al., 2004; Yu et al., 2004;
Liu et al., 2007; Trevaskis et al., 2007; Fornara et al., 2008).
XAL2 Promotes Flowering Transition
XAL2 participates in flowering transition in response to more than
one signal, having a higher impact under non-inductive photope-
riod conditions (Figure 1C). Flowering time is not so clearly
affected in the xal2 alleles, under all conditions tested, as is
soc1, probably because SOC1 and AGL24 are able to directly
activate LFY independently of XAL2 (Lee et al., 2008; Liu et al.,
2008). We proved that CO positively regulates XAL2 and that
the latter positively regulates SOC1 and probably AGL24
(Figure 2C–2E). Being soc1 epistatic over xal2 under LD
condition confirms this result (Figure 1B). We also proved that
under SD condition, and in response to GA, xal2 is affected in
bolting time and xal2-2 soc1-6 has an additive effect compared
with the parental plants (Figure 1C and 1D and Supplemental
Table 1). These results could imply that they act independently
over LFY and AP1 regulation, or that they are part of the same
regulatory module. We argue that XAL2 is probably part of the
same GRN in which SOC1 participates, integrating at least
some of the flowering transition pathways in response to
different signals. In fact, the spatial and temporal patterns
of expression of XAL2, and its loss and gain of function
phenotypes, resemble those corresponding to SOC1 and
AGL24 lines (Borner et al., 2000; Yu et al., 2004; Liu et al.,
2007; Gregis et al., 2009), thus suggesting that XAL2 is part of
the SOC1–AGL24 regulatory module. Moreover, XAL2 interacts
with SOC1 and AGL24 according to yeast two-hybrid data
(de Folter et al., 2005; Immink et al., 2009).
XAL2 Overexpression Affects FM Maintenance and
Determinacy by Up-Regulating TFL1 and WUS
After the flowering transition, LFY, AP1, and CAL are necessary
for FM identity (Weigel et al., 1992; Bowman et al., 1993;
Ferrándiz et al., 2000) by repressing the IM genes, particularly
TFL1 (Shannon and Meeks-Wagner, 1991; Schultz and Haughn,
1993; Gustafson-Brown et al., 1994; Mandel and Yanofsky,
1995; Liljegren et al., 1999). During the first and second stages
of FM development, SOC1, AGL24, and SVP maintain FM
identity in collaboration with AP1 by repressing AG and SEP3
(Gregis et al., 2006, 2009; Liu et al., 2009). At stage 3 of flower
development, LFY and AP1 repress the expression of the
‘‘flowering genes,’’ allowing the transcription of the floral organ
identity genes (Yu et al., 2004; Liu et al., 2007). LFY and WUS,
among other genes, induce the expression of AG during this
stage, which in turn represses WUS at stage 6, together with
other proteins (Lenhard et al., 2001; Lohmann, et al., 2001;
Gómez-Mena et al., 2005; Lee et al., 2005; Sun et al., 2009; Sun
and Ito, 2010; Liu et al., 2011). This event drastically affects the
FM stem cells, which stop proliferating (Mizukami and Ma, 1997).
These experimental data indicate that certain genes have
clear effects in the FM when their expression is depleted or
augmented; however, we think that FM identity, maintenance,
and determinacy emerge from a complex GRN in which spatio-
temporal regulations of SOC1, AGL24, SVP, and XAL2 are also
important. Indeed, in this study we have shown that overexpres-
sion of XAL2 affects FM maintenance and yields phenotypes
similar to those reported for the overexpression lines of SOC1,
AGL24, and their homologs (Borner et al., 2000; Michaels et al.,
2003; Ferrario et al., 2004; Masiero et al., 2004; Yu et al., 2004;
Liu et al., 2007; Trevaskis et al., 2007; Fornara et al., 2008).
More importantly, we demonstrate that overexpression of any
of these genes is sufficient to induce TFL1 expression
(Figure 4A and 4C), suggesting that mis-regulation of TFL1 under-
lies the ‘‘leaf-like’’ flower phenotype observed in the overexpres-
sion of these three MADS-box genes. In this regard, Hanano and
AP1 LFY SOC1 AGL24 XAL2 TFL1 FT GA CO
VM 0 0 0 0 0 1 0 a 0
IM 0 0 a a a 1 a a a
FM 1 1 a a a 0 a a a
Table 1. Observed Expression States of the Genes Considered in the NetworkModel in Wild-Type Plants during Different Stages of the
SAM Development.
aAny possible value of the node in the network.
806 Molecular Plant 8, 796–813, May 2015 ª The Author 2015.
Molecular Plant XAANTAL2 (AGL14) in Arabidopsis SAM Transitions
Goto (2011) had demonstrated that TFL1 acts as a transcriptional
repressor, and the 35S::TFL1-SRDX line phenotype reported by
these authors is, in fact, very similar to the XAL2 phenotype re-
ported here (Figure 3).
Overexpression of SOC1, AGL24, or XAL2 genes affects SAM
transitions, causing premature flowering and LFY/AP1 up-
regulation (Figures 2, 4, and 5). At the same time, we have
proved that overexpression of these MADS-box genes induces
higher levels of TFL1 mRNA accumulation compared with
wild-type plants (Figure 4A and 4C). Furthermore, we have
shown that XAL2 directly binds to TFL1 regulatory sequences
using the overexpression line 35S::GFP:XAL2 (Figure 4B).
Interestingly, one of these binding sites (fragment V of the TFL1
30 region amplified in our ChIP assay) corresponds to one of the
binding sites of AP1, which has been demonstrated to be
important for direct repression of TFL1 (Kaufman et al., 2010).
More recently, it was demonstrated that SOC1, AGL24, SVP,
and SEP4 cooperate with AP1 in this action (Liu et al., 2013).
However, it is possible that, when overexpressed, higher ratios
of XAL2, SOC1, or AGL24 over AP1 are able to compete for the
same binding site, affecting TFL1 transcription in an opposite
way. The partial complementation of the vegetative and
indeterminacy features of the 35S::XAL2 line by crossing it with
35S::AP1 supports this hypothesis (Figure 5). If TFL1 and,
probably, other genes important for IM identity are ectopically
expressed in the FM, this would explain the inflorescence
characteristics of those flowers even in the presence of AP1
which is not down-regulated (Figures 2 and 4; Supplemental
Figure 3), and probably not mis-localized either, as reported
for AGL24/SVP homolog OsMADS47 overexpression line
(Fornara et al., 2008). In this sense, the FM does not change its
identity through a floral reversion process. Instead it behaves
differently, probably having a mixed IM/FM identity, due to an
altered behavior of the GRN (Figures 6 and 7).
Heterochronic ‘‘floral reversion’’ has been shown to be depen-
dent on light and gibberellin signaling that affects a signal com-
ing from the leaves to the SAM (Okamuro et al., 1996; Hempel
et al., 2000). We now know that this signal is FT (Jaeger and
Wigge, 2007; Müller-Xing et al., 2014). During flowering
transition, this protein competes with TFL1 for FD, and this
association up-regulates SOC1 and AP1 in the anlagen (Abe
et al., 2005; Wigge et al., 2005; Hanano and Goto, 2011;
Jaeger et al., 2013). In the overexpression lines of XAL2,
SOC1, and AGL24, up-regulation of TFL1 or delayed expression
of FT under SD condition would affect such balance until
endogenous FT protein attains certain levels during Arabidopsis
inflorescence development. This would explain why the acrop-
etal flowers show a wild-type phenotype while the early ones
show IM features. This and related hypotheses could be tested
by expanding the dynamic GRN and EL modeling framework
proposed here.
Early Flowering and FM Phenotypes of the XAL2
Overexpression Line under LDCondition are Reconciled
Using a GRN Model and EL Analysis
We proposed GRN and EL models that provide a framework for
mechanistic explanations of SAM transitions in wild-type plants,
but also the complex loss and gain of function phenotypes of
XAL2 and other regulators of SAM transitions. In particular, this
provides a novel framework with which to evaluate floral rever-
sion. Floral reversion has been defined as the reappearance
of vegetative traits during flower development or the loss of FM
determinacy after floral organs are formed. This uncommon
process in Arabidopsis has been attributed to reversion of the
FM to the IM identity, particularly in the lfy and ap1 mutants
(Battey and Lyndon, 1990; Okamuro et al., 1993, 1996, 1997;
Tooke et al., 2005). In contrast, based on previous data and
the experimental results summarized here, we postulate an
alternative explanation for the so-called floral reversion in the
case of the SOC1, AGL24, and XAL2 overexpression lines.
Our results of the deterministic GRN model suggest that a mixed
meristem identity is attained as a steady state when XAL2,
AGL24, or SOC1 are overexpressed, while the same GRN yields
normal configurations when the same genes are kept to ‘‘0.’’
Indeed, based on our experimental data, in the model the IM/
FM identity is the result of the positive regulation of these three
MADS-box genes over TFL1, LFY, and AP1. When either of these
genes is overexpressed, TFL1 and AP1 or LFY are activated,
while at the same time the multiple feedback loops among
them stabilize their expression, thus yielding the IM/FM identity
(Figures 6 and 7).
The EL simulations suggest a mechanism by which 35S::SOC1,
35S::AGL24, or 35S::XAL2 cause a fraction of the cell population
at the IM to acquire a mixed IM/FM identity (Figure 7). This
could be explained by two alternative hypotheses. During
normal developmental VM > IM > FM transitions, a fraction
of cells may attain the new mixed identity IM/FM. Under this
circumstance the establishment of the antagonistic relationship
between IM and FM regulators may be weakened. On the other
hand, an induced, reverse rate of transition from FM to IM or
IM/FM cells could account for the results. The modeling results
show that the first one is the most probable one, and the
overexpressor global transition pattern is: VM > IM > IM/FM >
FM (Figure 7F and Supplemental Figure 4). Therefore, for this
and similar cases the term ‘‘floral reversion’’ should be avoided.
Loss of FM Determinacy in XAL2 Overexpression Lines
under SD Condition
Constitutive expression of XAL2 also affects floral determinacy
under SD condition. Here we showed that under this condition
new inflorescences develop from the inside of the carpels of
the basipetal flowers (Figure 3H, 3J, and 3K). At the molecular
level, this may be explained in two different ways: either
the presence of XAL2 prevents WUS repression or ectopic
expression of this gene is sufficient to up-regulate WUS. We
observed that WUS expression in the 35S::XAL2 is maintained
after stage 6, enabling stem cells to remain active (Figures 3J,
3K, and 4F). At this point, we cannot know if the FM
maintenance and indeterminacy phenotypes observed in the
overexpression lines of XAL2, SOC1, or AGL24 are due to a
dominant negative effect or to gain of function. Interestingly,
overexpression of XAL2 or SOC1 represses each other
(Figure 2C and 2F), indicating that in these lines altered protein
complexes could be formed. These hypotheses can be tested
using an expanded GRN module including additional SAM
genes. Furthermore, such a model could address whether FM
Molecular Plant 8, 796–813, May 2015 ª The Author 2015. 807
XAANTAL2 (AGL14) in Arabidopsis SAM Transitions Molecular Plant
Figure 7. Epigenetic Landscape Analysis for the XAL2 Regulatory Network Module.
(A–D) Temporal sequence of cell-fate attainment pattern under the stochastic GRNmodel during SAM cell-fate transitions. Themaximum probability p of
attaining each attractor, as a function of time (in iteration steps) is shown for (A)WT, (B) XAL2 overexpression (XAL2-OE), (C) SOC1-OE, and (D) AGL24-
OE. Vertical lines mark the time at which maximum probability of each steady state (i.e., cell fate, VM, IM, FM, or IM/FM) is attained. Note that the
maximum probability for each steady state is 1. Themost probable sequence of cell-fate attainment for theWT is VM, IM, FM; and for OE lines VM, IM (IM/
FM), FM. The value of the error probability used in this case was x = 0.05. The same patterns were obtained with error probabilities from 0.01 to 0.1 (data
not shown).
(E) Schematic representation of the possible transitions between pairs of steady states (cell fates at the SAM) for WT and OE lines. Arrows indicate the
directionality of the transitions. Above each arrow a sign (+) or () indicates whether the calculated net transition rate between the corresponding
(legend continued on next page)
808 Molecular Plant 8, 796–813, May 2015 ª The Author 2015.
Molecular Plant XAANTAL2 (AGL14) in Arabidopsis SAM Transitions
A WT 8 XAL2-0E 
VM 1M FM VM 1M IM-FM FM 
........ ............ 00 --- 00 
c:i c:i 
~ « --- ........ 
a.. '<t ce '<t .. -.. ~ _ .. ..:. : :.: .. -
c:i c:i 
---o o 
c:i c:i 
5 10 15 20 5 10 15 20 
t 
e SOCI-OE 
D 
AGL24-0E 
VM 1M IM-FM FM VM 1M IM-FM FM 
00 00 
c:i c:i 
« ---........ « ---.. ........ 
ce '<t .. -.. ~ - " ,:, "- " -=. .. - .. - .. ce '<t ..-.. --... .:, ... .. _ .. - .. _ ... 
c:i c:i 
o o 
c:i c:i 
5 10 15 20 5 10 15 20 
t t 
E WT OELINES 
8 · ~ "@ 8 ~ 
+ 
@ -
~ ' \ # { + " 1 - ' + : 
"" - : + 
- + , I 
" f 
8 @' + 
~ S F FM E -
dIM -+ JMI FM > o 
WT 
FM 
OELINES 
FM 
to IM transition in the indeterminate carpels, which corresponds
to cell reprogramming, is favored under XAL2 or other MADS-
box overexpression.
METHODS
Plant Material and Selection of Mutant Lines
Arabidopsis thaliana wild-type and mutant plants used in this study
were Col-0 with the exception of ap1-1 cal-5 and tfl1-2, which are in Ler
ecotype. Mutant alleles xal2-1 and xal2 were described previously
(Garay-Arroyo et al., 2013). The soc1-6 (SALK_138131; Wang et al.,
2009), ful-7 (SALK_033647;Wang et al., 2009), and agl24-4 (GK674F05.03/
N385337) mutant seeds were provided by the Arabidopsis Biological
Resource Center or the Nottingham Arabidopsis Stock Centre, and
the homozygous alleles were selected using the primers shown in
Supplemental Table 4.
Plant Growth Conditions and Flowering Time Measurements
Seedlings were grown on vertical plates with 0.23 Murashige and Skoog
(MS) medium (Murashige and Skoog, 1962) containing 1% sucrose. For
flowering experiments, plants were grown on soil (Metromix 200) under
LD (16 h light/8 h dark) or SD (8 h light/16 h dark) condition at 22C. For
GA3 treatment, plants were grown under SD condition for 2 weeks
before they were sprayed with 100 mM GA3 twice a week until flowering.
For vernalization experiments, seeds were plated on MS medium and
kept in the dark for 8 weeks at 4C and then transferred to soil and
grown under LD condition. Flowering transition was measured as
bolting time (days after seed sowing required for the stem to grow to
1 cm long) and by the RLN at bolting. Inflorescences for in situ
hybridization were collected when the stem reached 10 cm long. These
comprised FM at different developmental stages.
Plasmid Constructs and Plant Selection
The XAL2 gene was amplified from Col-0, using the XAL2g-F 50-AGAA
GAATGGTGAGGGGAAA-30 and XAL2g-R 50-ATGTTAGTTTGAAGGAG
GAA-30 primers. The 3603 nt DNA fragment was cloned in the pCR8/
GW/TOPO-TA vector, and verified by sequencing. It was then recombined
into either overexpression vectors: pGD625 (de Folter et al., 2006) or the
pK7WGF2 that includes GFP (Karimi et al., 2002) carrying a kanamycin
and spectinomycin/streptomycin resistance cassette, respectively.
Kanamycin (50 mg/ml) resistant plants were selected on plates.
In Situ Hybridization Analysis
In situ hybridization was performed according to Tapia-López et al. (2008).
In vitro transcription with the DIG RNA labeling Kit (Roche Molecular
Biochemicals) was performed to generate the antisense XAL2 probe
using as a template the XAL2-F 50-GTTTCCTCCTTCAAACTAACA-30
and XAL2-R 50-GCAACTGCTAAATTCAGTAAG-30 amplified cDNA frag-
ment cloned into p-GEM-T.
Quantitative Real-Time RT–PCR
Aerial tissue from three independent biological replicates (15 plants each)
was used for total RNA extraction with Trizol reagent, and two indepen-
dent cDNAs were reverse transcribed using Superscript II (Invitrogen).
We amplified PDF2 (AT1G13320) and UPL7 (AT1G13320) as positive
internal controls (Czechowski et al., 2005), and their stability across the
compared samples was confirmed using geNorm (Vandesompele et al.,
2002). Amplification efficiencies were analyzed using Real Time PCR
Miner (Zhao and Fernald, 2005), and relative expression was calculated
using the DDCT method (Vandesompele et al., 2002). Primer sequences
are presented in Supplemental Table 4.
Microscopy
An Olympus SZ60 dissecting microscope with C-5060 digital camera was
used for lightmicroscopy. Sectioned carpels were fixed in 4%paraformal-
dehyde, dehydrated in ethanol series, and embedded in paraffin. Sections
(8 mm) were stained with toluidine blue 0.05%. For scanning electron
microscopy, plant material was fixed at 4C overnight in 50% ethanol,
5% acetic acid, and 3.7% formaldehyde in 0.025 M phosphate buffer
(pH 7.0). Samples were subsequently washed twice (30 min) in 70%
ethanol in the same phosphate buffer, followed by 0.05 M phosphate
buffer (pH 7.0). Samples were dehydrated gradually to ethanol 100%,
and dried in liquid carbon dioxide at the critical point. Finally, samples
were covered with gold using a sputter coater and observed with a
scanning electron microscope.
TFL1 ChIP Assays
Wild-type and the 35S::GFP-XAL2 line were grown in MS plates under LD
condition and inflorescence tissue (0.5 g) was fixed for 20 min. Chromatin
was solubilized with a sonicator by three pulses of 15 s each. Immunopre-
cipitation was performed overnight using anti-GFP rabbit IgG fraction
(A11122; Invitrogen) and protein A agarose beads (Santa Cruz). Samples
were treated with proteinase K after elution followed by precipitation.
Template ChIP DNA was diluted and amplified for 35–40 cycles
(de Folter et al., 2007; de Folter, 2011). Primer pairs were designed in
flanking regions of CArG boxes found along 2 kb upstream of the start
codon, as well as 4.6 kb downstream of the TFL1 gene (Supplemental
Table 4).
GRN Model: Recovery of Gene Expression Profiles
Characteristic of VM, IM, and FM Cell Types
The GRN was modeled using a discrete multi-state GRN formalism
as described by Espinosa-Soto et al. (2004) and Álvarez-Buylla et al.
(2010a, 2010b).
Stochastic GRN Model Implementation: EL Approach
To explore the patterns of cell-fate attainment and transition among
cells, a discrete stochastic GRN dynamic model was implemented as
an extension of the deterministic Booleanmodel described in the previous
section. Stochasticity is modeled by introducing a constant probability of
error for the deterministic Boolean logical functions according to:
xiðt + 1Þ=

fiðtÞ; with prob 1 x
1 fiðtÞ; with prob x

:
We followed Álvarez-Buylla et al. (2008). This approach yields a probability
matrix that was then used to describe how the probability of being in a
particular steady state changes in time by iterating the dynamic equation
pxðt + 1Þ=pxðtÞP;
where P is the transition probability matrix and pxðtÞ the distribution vector
specifying the proportion of cells or the probability of a single one being in
each steady state at a given time.
attractors is positive or negative. Red arrows represent the globally consistent ordering for the 3(4) attractors: the order of the attractors in which all
individual transition has a positive net rate, resulting in a global probability flow across the EL as also shown in (F) (see Supplemental Methods).
(F) Schematic representation of the EL of the GRNmodeled here. The relative barrier heights represent the hierarchy of calculated positive net probability
rates, which altogether determine a consistent global ordering of the relative steady-state stabilities. According to the net probability rates, only one set of
ordered transition (VM > IM > [IM/FM] > FM) produces a positive probability flow (see Supplemental Methods). As a result, a global developmental
gradient in the EL is produced. Importantly this 2D representation is for illustrative purposes only and, as such, does not represent scales based on
exact calculated values.
Molecular Plant 8, 796–813, May 2015 ª The Author 2015. 809
XAANTAL2 (AGL14) in Arabidopsis SAM Transitions Molecular Plant
EL Exploration
To explore the EL associated with a GRN, the number, depth, width, and
relative position of the GRN attractors are represented by the hills and
valleys of Waddington’s (EL) metaphor (Álvarez-Buylla et al., 2008).
In addition to the calculation of the most probable temporal cell-fate
pattern, a discrete stochastic GRN model allows calculations of the
shortest and fastest pathways of cell-fate transitions, as well as possible
restrictions of some cell-fate transitions that also emerge from the
GRN topology and the associated EL. We calculated the mean first pas-
sage time (MFPT) between each pair of possible transitions to uncover
which of these is more feasible. MFPT was estimated numerically by
using the transition probabilities among steady states from a large number
of samples of paths simulated as a finiteMarkov chain process (Wilkinson,
2011). The MFPT from one steady state (i) to another (j) corresponds
to the average value of the number of steps taken to visit attractor j
for the first time, given that the entire probability mass was initially
localized at steady state i. The average is taken over a large number
of realizations (simulations). Based on the MFPT values, a net
transition rate between steady states i and j can be defined as follows:
di/j = 1=MFPTi/j  1=MFPTj/i. This quantity effectively measures the
facility by which a state transits from one state to another as a net
probability flow (Zhou et al., 2014). For all stochastic modeling,
robustness was assessed by comparing three different values for the
error probability (0.01, 0.05, 0.1). The number of simulated samples was
increased until stable results were attained. See also Supplemental
Methods.
SUPPLEMENTAL INFORMATION
Supplemental Information is available at Molecular Plant Online.
FUNDING
This research was supported by CONACyT (81433; 180098; 180380;
167705; 152649; 147675; 177739), PAPIIT, UNAM (IN204011-3;
IN203214-3; IN203113-3; IN203814-3), and UC-MEXUS ECO-IE415
grants. E.R.A.B. was supported by the Miller Institute for Basic Research
in Science, University of California, Berkeley, USA.
ACKNOWLEDGMENTS
This paper constitutes a partial fulfillment of the graduate program
‘‘Doctorado en Ciencias Biomédicas of the Universidad Nacional Autó-
noma de México’’ in which Rigoberto V. Pérez-Ruiz developed this proj-
ect. We acknowledge Dr. Yanofsky and Dr. Pelaz for helping at early
stages of this work. We thank researchers who shared their lines: Dr.
Alonso provided tfl1-1 mutant; Dr. Yu the co-1 allele, and the 35S::AP1
and 35S::AGL24 lines; Dr. Yanofsky the lfy-9, ap1-15, and ap1-1 cal-5mu-
tants, and Dr. Lee the agl20-101D line. Diana Romo, Dr. Martı́nez-Silva,
and K. González-Aguilera helped with logistical and technical tasks, and
Dr. Espinosa-Matias SEM preparations (Facultad de Ciencias, UNAM).
We thank Rich Jorgensen for editing. No conflict of interest declared.
Received: December 10, 2014
Revised: December 10, 2014
Accepted: January 5, 2015
Published: January 28, 2015
REFERENCES
Abe, M., Kobayashi, Y., Yamamoto, S., Daimon, Y., Yamaguchi, A.,
Ikeda, Y., Ichinoki, H., Notaguchi, M., Goto, K., and Araki, T.
(2005). FD, a bZIP protein mediating signals from the floral pathway
integrator FT at the shoot apex. Science 309:1052–1056.
Alvarez, J., Guli, C.L., Yu, X.H., and Smyth, D.R. (1992). Terminal flower:
a gene affecting inflorescence development in Arabidopsis thaliana.
Plant J. 2:103–116.
Álvarez-Buylla, E.R., Pelaz, S., Liljegren, S.J., Gold, S.E., Burgeff, C.,
Ditta, G.S., Ribas de Pouplana, L., Martinez-Castilla, L., and
Yanofsky, M.F. (2000). An ancestral MADS-box gene duplication
occurred before the divergence of plants and animals. Proc. Natl.
Acad. Sci. USA 97:5328–5333.
Álvarez-Buylla, E.R., Balleza, E., Benitez, M., Espinosa-Soto, C., and
Padilla-Longoria, P. (2008). Gene regulatory network models: a
dynamic and integrative approach to development. SEB Exp. Biol.
Ser. 61:113–139.
Álvarez-Buylla, E.R., Benitez, M., Corvera-Poire, A., Chaos Cador, A.,
de Folter, S., Gamboa de Buen, A., Garay-Arroyo, A., Garcia-
Ponce, B., Jaimes-Miranda, F., Perez-Ruiz, R.V., et al. (2010a).
Flower development. Arabidopsis Book 8:e0127.
Álvarez-Buylla, E.R., Azpeitia, E., Barrio, R., Benitez, M., and Padilla-
Longoria, P. (2010b). From ABC genes to regulatory networks,
epigenetic landscapes and flower morphogenesis: making biological
sense of theoretical approaches. Semin. Cell Dev. Biol. 21:108–117.
An, H., Roussot, C., Suarez-Lopez, P., Corbesier, L., Vincent, C.,
Pineiro, M., Hepworth, S., Mouradov, A., Justin, S., Turnbull, C.,
et al. (2004). CONSTANS acts in the phloem to regulate a systemic
signal that induces photoperiodic flowering of Arabidopsis.
Development 131:3615–3626.
Andrés, F., and Coupland, G. (2012). The genetic basis of flowering
responses to seasonal cues. Nat. Rev. Genet. 13:627–639.
Balasubramanian, S., Sureshkumar, S., Lempe, J., and Weigel, D.
(2006). Potent induction of Arabidopsis thaliana flowering by elevated
growth temperature. PLoS Genet. 2:e106.
Battey, N.H., and Lyndon, R.F. (1990). Reversion of flowering. Bot. Rev.
56:162–189.
Benlloch, R., Berbel, A., Serrano-Mislata, A., and Madueno, F. (2007).
Floral initiation and inflorescence architecture: a comparative view.
Ann. Bot. 100:1609.
Blázquez, M.A., and Weigel, D. (2000). Integration of floral inductive
signals in Arabidopsis. Nature 404:889–892.
Blázquez, M.A., Green, R., Nilsson, O., Sussman, M.R., andWeigel, D.
(1998). Gibberellins promote flowering of Arabidopsis by activating the
LEAFY promoter. Plant Cell 10:791–800.
Blázquez, M.A., Ahn, J.H., and Weigel, D. (2003). A thermosensory
pathway controlling flowering time in Arabidopsis thaliana. Nat.
Genet. 33:168–171.
Borner, R., Kampmann, G., Chandler, J., Gleissner, R., Wisman, E.,
Apel, K., and Melzer, S. (2000). A MADS domain gene involved in
the transition to flowering in Arabidopsis. Plant J. 24:591–599.
Bowman, J.L., Alvarez, J., Weigel, D., Meyerowitz, E.M., and Smyth,
D.R. (1993). Control of flower development in Arabidopsisthaliana
by APETALA1 and interacting genes. Development 119:
721–743.
Bradley, D., Ratcliffe, O., Vincent, C., Carpenter, R., and Coen, E.
(1997). Inflorescence commitment and architecture in Arabidopsis.
Science 275:80–83.
Chen, L.J., Cheng, J.C., Castle, L., and Sung, Z.R. (1997). EMF genes
regulate Arabidopsis inflorescence development. Plant Cell 9:2011–
2024.
Conti, L., and Bradley, D. (2007). TERMINAL FLOWER1 is amobile signal
controlling Arabidopsis architecture. Plant Cell 19:767–778.
Czechowski, T., Stitt, M., Altmann, T., Udvardi, M.K., and Scheible,
W.R. (2005). Genome-wide identification and testing of superior
reference genes for transcript normalization in Arabidopsis. Plant
Physiol. 139:5–17.
de Folter, S. (2011). Protein tagging for chromatin immunoprecipitation
from Arabidopsis. Methods Mol. Biol. 678:199–210.
de Folter, S., Immink, R.G.H., Kieffer, M., Parenicová, L., Henz, S.R.,
Weigel, D., Busscher, M., Kooiker, M., Colombo, L., Kater, M.M.,
810 Molecular Plant 8, 796–813, May 2015 ª The Author 2015.
Molecular Plant XAANTAL2 (AGL14) in Arabidopsis SAM Transitions
et al. (2005). Comprehensive interaction map of the ArabidopsisMADS
box transcription factors. Plant Cell 17:1424–1433.
de Folter, S., Shchennikova, A.V., Franken, J., Busscher, M., Baskar,
R., Grossniklaus, U., Angenent, G.C., and Immink, R.G.H. (2006).
A B-sister MADS-box gene involved in ovule and seed development
in petunia and Arabidopsis. Plant J. 47:934–946.
de Folter, S., Urbanus, S.L., van Zuijlen, L.G.C., Kaufmann, K., and
Angenent, G.C. (2007). Tagging of MADS domain proteins for
chromatin immunoprecipitation. BMC Plant Biol. 7:47.
Dorca-Fornell, C., Gregis, V., Grandi, V., Coupland, G., Colombo, L.,
and Kater, M.M. (2011). The Arabidopsis SOC1-like genes AGL42,
AGL71 and AGL72 promote flowering in the shoot apical and axillary
meristems. Plant J. 67:1006–1017.
Espinosa-Soto, C., Padilla-Longoria, P., and Álvarez-Buylla, E.R.
(2004). A gene regulatory network model for cell-fate determination
during Arabidopsis thaliana flower development that is robust and
recovers experimental gene expression profiles. Plant Cell 16:2923–
2939.
Ferrándiz, C., Gu, Q., Martienssen, R., and Yanofsky, M.F. (2000).
Redundant regulation of meristem identity and plant architecture
by FRUITFULL, APETALA1 and CAULIFLOWER. Development 127:
725–734.
Ferrario, S., Busscher, J., Franken, J., Gerats, T., Vandenbussche, M.,
Angenent, G.C., and Immink, R.G.H. (2004). Ectopic expression of
the petunia MADS box gene UNSHAVEN accelerates flowering and
confers leaf-like characteristics to floral organs in a dominant-
negative manner. Plant Cell 16:1490–1505.
Fornara, F., Gregis, V., Pelucchi, N., Colombo, L., and Kater, M. (2008).
The rice StMADS11-like genes OsMADS22 and OsMADS47 cause
floral reversions in Arabidopsis without complementing the svp and
agl24 mutants. J. Exp. Bot. 59:2181–2190.
Garay-Arroyo, A., Ortiz-Moreno, E., de la Paz Sánchez, M., Murphy,
A.S., Garcı́a-Ponce, B., Marsch-Martı́nez, N., de Folter, S.,
Corvera-Poire, A., Jaimes-Miranda, F., Pacheco-Escobedo, M.A.,
et al. (2013). The MADS transcription factor XAL2/AGL14 modulates
auxin transport during Arabidopsis root development by regulating
PIN expression. EMBO J. 32:2884–2895.
Gómez-Mena, C., de Folter, S., Costa, M.M., Angenent, G.C., and
Sablowski, R. (2005). Transcriptional program controlled by the
floral homeotic gene AGAMOUS during early organogenesis.
Development 132:429–438.
Gramzow, L., Ritz, M.S., and Theissen, G. (2010). On the origin of
MADS-domain transcription factors. Trends Genet. 26:149–153.
Gregis, V., Sessa, A., Colombo, L., and Kater, M.M. (2006). AGL24,
SHORT VEGETATIVE PHASE, and APETALA1 redundantly control
AGAMOUS during early stages of flower development in Arabidopsis.
Plant Cell 18:1373–1382.
Gregis, V., Sessa, A., Dorca-Fornell, C., and Kater, M.M. (2009). The
Arabidopsis floral meristem identity genes AP1, AGL24 and SVP
directly repress class B and C floral homeotic genes. Plant J.
60:626–637.
Gustafson-Brown, C., Savidge, B., and Yanofsky, M.F. (1994).
Regulation of the Arabidopsis floral homeotic gene APETALA1. Cell
76:131–143.
Halliday, K.J., Salter, M.G., Thingnaes, E., and Whitelam, G.C. (2003).
Phytochrome control of flowering is temperature sensitive and
correlates with expression of the floral integrator FT. Plant J. 33:875–
885.
Han, P., Garcia-Ponce, B., Fonseca-Salazar, G., Álvarez-Buylla, E.R.,
and Yu, H. (2008). AGAMOUS-LIKE 17, a novel flowering promoter,
acts in a FT-independent photoperiod pathway. Plant J. 55:253–265.
Hanano, S., and Goto, K. (2011). Arabidopsis TERMINAL FLOWER1 is
involved in the regulation of flowering time and inflorescence
development through transcriptional repression. Plant Cell 23:3172–
3184.
Hartmann, U., Hohmann, S., Nettesheim, K., Wisman, E., Saedler, H.,
and Huijser, P. (2000). Molecular cloning of SVP: a negative regulator
of the floral transition in Arabidopsis. Plant J. 21:351–360.
Hempel, F.D., Welch, D.R., and Feldman, L.J. (2000). Floral induction
and determination: where is flowering controlled? Trends Plant Sci.
5:17–21.
Huala, E., and Sussex, I.M. (1992). LEAFY interacts with floral homeotic
genes to regulate Arabidopsis floral development. Plant Cell 4:
901–913.
Immink, R.G.H., Tonaco, I.A.N., de Folter, S., Shchennikova, A., van
Dijk, A.D.J., Busscher-Lange, J., Borst, J.W., and Angenent, G.C.
(2009). SEPALLATA3: the ‘glue’ for MADS box transcription factor
complex formation. Genome Biol. 10:R24.
Irish, V.F., and Sussex, I.M. (1990). Function of the apetala-1 gene during
Arabidopsis floral development. Plant Cell 2:741–753.
Jaeger, K.E., and Wigge, P.A. (2007). FT protein acts as a long-range
signal in Arabidopsis. Curr. Biol. 17:1050–1054.
Jaeger, K.E., Pullen, N., Lamzin, S., Morris, R.J., and Wigge, P.A.
(2013). Interlocking feedback loops govern the dynamic behavior of
the floral transition in Arabidopsis. Plant Cell 25:820–833.
Karimi, M., Inze, D., and Depicker, A. (2002). GATEWAY vectors for
Agrobacterium-mediated plant transformation. Trends Plant Sci.
7:193–195.
Kaufmann, K., Wellmer, F., Muino, J.M., Ferrier, T., Wuest, S.E.,
Kumar, V., Serrano-Mislata, A., Madueno, F., Krajewski, P.,
Meyerowitz, E.M., et al. (2010). Orchestration of floral initiation by
APETALA1. Science 328:85–89.
Kaufmann, K., Nagasaki, M., and Jauregui, R. (2011). Modelling the
molecular interactions in the flower developmental network of
Arabidopsis thaliana. Stud. Health Technol. Inform. 162:279–297.
Koornneef, M., Hanhart, C.J., and van der Veen, J.H. (1991). A genetic
and physiological analysis of late flowering mutants in Arabidopsis
thaliana. Mol. Gen. Genet. 229:57–66.
Lee, J., and Lee, I. (2010). Regulation and function of SOC1, a flowering
pathway integrator. J. Exp. Bot. 61:2247–2254.
Lee, H., Suh, S.S., Park, E., Cho, E., Ahn, J.H., Kim, S.G., Lee, J.S.,
Kwon, Y.M., and Lee, I. (2000). The AGAMOUS-LIKE 20 MADS
domain protein integrates floral inductive pathways in Arabidopsis.
Genes Dev. 14:2366–2376.
Lee, J.Y., Baum, S.F., Alvarez, J., Patel, A., Chitwood, D.H., and
Bowman, J.L. (2005). Activation of CRABS CLAW in the nectaries
and carpels of Arabidopsis. Plant Cell 17:25–36.
Lee, J.H., Yoo, S.J., Park, S.H., Hwang, I., Lee, J.S., and Ahn, J.H.
(2007). Role of SVP in the control of flowering time by ambient
temperature in Arabidopsis. Genes Dev. 21:397–402.
Lee, J., Oh, M., Park, H., and Lee, I. (2008). SOC1 translocated to the
nucleus by interaction with AGL24 directly regulates LEAFY. Plant J.
55:832–843.
Lenhard, M., Bohnert, A., Jurgens, G., and Laux, T. (2001). Termination
of stem cell maintenance in Arabidopsis floral meristems by
interactions between WUSCHEL and AGAMOUS. Cell 105:805–814.
Li, D., Liu, C., Shen, L., Wu, Y., Chen, H., Robertson, M., Helliwell, C.A.,
Ito, T., Meyerowitz, E., and Yu, H. (2008). A repressor complex
governs the integration of flowering signals in Arabidopsis. Dev. Cell
15:110–120.
Molecular Plant 8, 796–813, May 2015 ª The Author 2015. 811
XAANTAL2 (AGL14) in Arabidopsis SAM Transitions Molecular Plant
Liljegren, S.J., Gustafson-Brown, C., Pinyopich, A., Ditta, G.S., and
Yanofsky, M.F. (1999). Interactions among APETALA1, LEAFY, and
TERMINAL FLOWER1 specify meristem fate. Plant Cell 11:1007–1018.
Liu, C., Zhou, J., Bracha-Drori, K., Yalovsky, S., Ito, T., and Yu, H.
(2007). Specification of Arabidopsis floral meristem identity by
repression of flowering time genes. Development 134:1901–1910.
Liu, C., Chen, H., Er, H.L., Soo, H.M., Kumar, P.P., Han, J.H., Liou, Y.C.,
and Yu, H. (2008). Direct interaction of AGL24 and SOC1 integrates
flowering signals in Arabidopsis. Development 135:1481–1491.
Liu, C., Xi, W.Y., Shen, L.S., Tan, C.P., and Yu, H. (2009). Regulation of
floral patterning by flowering time genes. Dev. Cell 16:711–722.
Liu, X., Kim, Y.J., Müller, R., Yumul, R.E., Liu, C., Pan, Y., Cao, X.,
Goodrich, J., and Chen, X. (2011). AGAMOUS terminates floral
stem cell maintenance in Arabidopsis by direct repressing
WUSCHEL through recruitment of polycomb group proteins. Plant
Cell 23:3654–3670.
Liu, C., Teo, Z.W., Bi, Y., Song, S., Xi, W., Yang, X., Yin, Z., and Yu, H.
(2013). A conserved genetic pathway determines inflorescence
architecture in Arabidopsis and rice. Dev. Cell 24:612–622.
Lohmann, J.U., Hong, R.L., Hobe, M., Busch, M.A., Parcy, F., Simon,
R., and Weigel, D. (2001). A molecular link between stem cell
regulation and floral patterning in Arabidopsis. Cell 105:793–803.
Mandel, M.A., and Yanofsky, M.F. (1995). A gene triggering flower
formation in Arabidopsis. Nature 377:522–524.
Martı́nez-Castilla, L.P., and Álvarez-Buylla, E.R. (2003). Adaptive
evolution in the Arabidopsis MADS-box gene family inferred from its
complete resolved phylogeny. Proc. Natl. Acad. Sci. USA 100:
13407–13412.
Masiero, S., Li, M.A., Will, I., Hartmann, U., Saedler, H., Huijser, P.,
Schwarz-Sommer, Z., and Sommer, H. (2004). INCOMPOSITA:
a MADS-box gene controlling prophyll development and floral
meristem identity in Antirrhinum. Development 131:5981–5990.
Melzer, S., Lens, F., Gennen, J., Vanneste, S., Rohde, A., and
Beeckman, T. (2008). Flowering-time genes modulate meristem
determinacy and growth form in Arabidopsis thaliana. Nat. Genet.
40:1489–1492.
Michaels, S.D., and Amasino, R.M. (1999). FLOWERING LOCUS C
encodes a novel MADS domain protein that acts as a repressor of
flowering. Plant Cell 11:949–956.
Michaels, S.D., Ditta, G., Gustafson-Brown, C., Pelaz, S., Yanofsky,
M., and Amasino, R.M. (2003). AGL24 acts as a promoter of
flowering in Arabidopsis and is positively regulated by vernalization.
Plant J. 33:867–874.
Mizukami, Y., and Ma, H. (1997). Determination of Arabidopsis floral
meristem identity by AGAMOUS. Plant Cell 9:393–408.
Moon, J., Suh, S.S., Lee, H., Choi, K.R., Hong, C.B., Paek, N.C., Kim,
S.G., and Lee, I. (2003). The SOC1 MADS-box gene integrates
vernalization and gibberellin signals for flowering in Arabidopsis.
Plant J. 35:613–623.
Müller-Xing, R., Clarenz, O., Pokorny, L., Goodrich, J., and Schubert,
D. (2014). Polycomb-group proteins and flowering locus to maintain
commitment to flowering in Arabidopsis thaliana. Plant Cell 26:2457–
2471.
Murashige, T., and Skoog, F. (1962). A revised medium for rapid
growth and bio assays with tobacco tissue cultures. Physiol. Plant. 15:
473–497.
Ng, M., and Yanofsky, M.F. (2001). Activation of the Arabidopsis B class
homeotic genes by APETALA1. Plant Cell 13:739–753.
Ohshima, S., Murata, M., Sakamoto, W., Ogura, Y., and Motoyoshi, F.
(1997). Cloning and molecular analysis of the Arabidopsis gene
terminal flower 1. Mol. Gen. Genet. 254:186–194.
Okamuro, J.K., Denboer, B.G.W., and Jofuku, K.D. (1993). Regulation
of Arabidopsis flower development. Plant Cell 5:1183–1193.
Okamuro, J.K., denBoer, B.G.W., LotysPrass, C., Szeto, W., and
Jofuku, K.D. (1996). Flowers into shoots: photo and hormonal
control of a meristem identity switch in Arabidopsis. Proc. Natl.
Acad. Sci. USA 93:13831–13836.
Okamuro, J.K., Szeto, W., LotysPrass, C., and Jofuku, K.D. (1997).
Photo and hormonal control of meristem identity in the Arabidopsis
flower mutants apetala2 and apetala1. Plant Cell 9:37–47.
Parcy, F., Bomblies, K., and Weigel, D. (2002). Interaction of LEAFY,
AGAMOUS and TERMINAL FLOWER1 in maintaining floral meristem
identity in Arabidopsis. Development 129:2519–2527.
Parenicová, L., de Folter, S., Kieffer, M., Horner, D.S., Favalli, C.,
Busscher, J., Cook, H.E., Ingram, R.M., Kater, M.M., Davies, B.,
et al. (2003). Molecular and phylogenetic analyses of the complete
MADS-box transcription factor family in Arabidopsis: new openings
to the MADS world. Plant Cell 15:1538–1551.
Porri, A., Torti, S., Romera-Branchat, M., and Coupland, G. (2012).
Spatially distinct regulatory roles for gibberellins in the promotion of
flowering of Arabidopsis under long photoperiods. Development
139:2198–2209.
Posé, D., Yant, L., and Schmid, M. (2012). The end of innocence:
flowering networks explode in complexity. Curr. Opin. Plant Biol.
15:45–50.
Putterill, J., Robson, F., Lee, K., Simon, R., and Coupland, G. (1995).
The constans gene of Arabidopsis promotes flowering and encodes
a protein showing similarities to zinc-finger transcription factors. Cell
80:847–857.
Ratcliffe, O.J., Amaya, I., Vincent, C.A., Rothstein, S., Carpenter, R.,
Coen, E.S., and Bradley, D.J. (1998). A common mechanism
controls the life cycle and architecture of plants. Development
125:1609–1615.
Ratcliffe, O.J., Bradley, D.J., and Coen, E.S. (1999). Separation of shoot
and floral identity in Arabidopsis. Development 126:1109–1120.
Rieu, I., Ruiz-Rivero, O., Fernandez-Garcia, N., Griffiths, J., Powers,
S.J., Gong, F., Linhartova, T., Eriksson, S., Nilsson, O., Thomas,
S.G., et al. (2008). The gibberellin biosynthetic genes AtGA20ox1
and AtGA20ox2 act, partially redundantly, to promote growth and
development throughout the Arabidopsis life cycle. Plant J. 53:
488–504.
Roeder, A.H., and Yanofsky, M.F. (2006). Fruit development in
Arabidopsis. Arabidopsis Book 4:e0075.
Rounsley, S.D., Ditta, G.S., and Yanofsky, M.F. (1995). Diverse roles for
MADS box genes inArabidopsis development. Plant Cell 7:1259–1269.
Roux, F., Touzet, P., Cuguen, J., and Le Corre, V. (2006). How to be
early flowering: an evolutionary perspective. Trends Plant Sci.
11:375–381.
Schmid, M., Uhlenhaut, N.H., Godard, F., Demar, M., Bressan, R.,
Weigel, D., and Lohmann, J.U. (2003). Dissection of floral induction
pathways using global expression analysis. Development 130:6001–
6012.
Schönrock, N., Bouveret, R., Leroy, O., Borghi, L., Kohler, C.,
Gruissem, W., and Hennig, L. (2006). Polycomb-group proteins
repress the floral activator AGL19 in the FLC-independent
vernalization pathway. Genes Dev. 20:1667–1678.
Schultz, E.A., and Haughn, G.W. (1991). Leafy, a homeotic gene that
regulates inflorescence development in Arabidopsis. Plant Cell
3:771–781.
Schultz, E.A., and Haughn, G.W. (1993). Genetic-analysis of the floral
initiation process (Flip) in Arabidopsis. Development 119:745–765.
812 Molecular Plant 8, 796–813, May 2015 ª The Author 2015.
Molecular Plant XAANTAL2 (AGL14) in Arabidopsis SAM Transitions
Shannon, S., and Meeks-Wagner, D.R. (1991). A mutation in the
Arabidopsis Tfl1 gene affects inflorescence meristem development.
Plant Cell 3:877–892.
Shannon, S., and Meeks-Wagner, D.R. (1993). Genetic interactions
that regulate inflorescence development in Arabidopsis. Plant Cell
5:639–655.
Sheldon, C.C., Rouse, D.T., Finnegan, E.J., Peacock, W.J., and
Dennis, E.S. (2000). The molecular basis of vernalization: the central
role of FLOWERING LOCUS C (FLC). Proc. Natl. Acad. Sci. USA
97:3753–3758.
Simpson, G.G. (2004). The autonomous pathway: epigenetic and post-
transcriptional gene regulation in the control of Arabidopsis flowering
time. Curr. Opin. Plant Biol. 7:570–574.
Smaczniak, C., Immink, R.G., Angenent, G.C., and Kaufmann, K.
(2012). Developmental and evolutionary diversity of plant MADS-
domain factors: insights from recent studies. Development 139:
3081–3098.
Smyth, D.R., Bowman, J.L., and Meyerowitz, E.M. (1990). Early flower
development in Arabidopsis. Plant Cell 2:755–767.
Srikanth, A., and Schmid, M. (2011). Regulation of flowering time: all
roads lead to Rome. Cell. Mol. Life Sci. 68:2013–2037.
Suárez-López, P., Wheatley, K., Robson, F., Onouchi, H., Valverde, F.,
and Coupland, G. (2001). CONSTANS mediates between the
circadian clock and the control of flowering in Arabidopsis. Nature
410:1116–1120.
Sugimoto, K., Gordon, S.P., andMeyerowitz, E.M. (2011). Regeneration
in plants and animals: dedifferentiation, transdifferentiation, or just
differentiation? Trends Cell Biol. 21:212–218.
Sun, B., and Ito, T. (2010). Floral stem cells: from dynamic balance
towards termination. Biochem. Soc. Trans. 38:613–616.
Sun, B., Xu, Y.F., Ng, K.H., and Ito, T. (2009). A timing mechanism for
stem cell maintenance and differentiation in the Arabidopsis floral
meristem. Genes Dev. 23:1791–1804.
Tapia-López, R., Garcia-Ponce, B., Dubrovsky, J.G., Garay-Arroyo,
A., Perez-Ruiz, R.V., Kim, S.H., Acevedo, F., Pelaz, S., and
Álvarez-Buylla, E.R. (2008). An AGAMOUS-related MADS-box gene,
XAL1 (AGL12), regulates root meristem cell proliferation and
flowering transition in Arabidopsis. Plant Physiol. 146:1182–1192.
Tooke, F., Ordidge, M., Chiurugwi, T., and Battey, N. (2005).
Mechanisms and function of flower and inflorescence reversion.
J. Exp. Bot. 56:2587–2599.
Trevaskis, B., Tadege, M., Hemming, M.N., Peacock, W.J., Dennis,
E.S., and Sheldon, C. (2007). Short vegetative phase-like MADS-box
genes inhibit floral meristem identity in barley. Plant Physiol.
143:225–235.
Vandesompele, J., De Preter, K., Pattyn, F., Poppe, B., Van Roy, N., De
Paepe, A., and Speleman, F. (2002). Accurate normalization of real-
time quantitative RT-PCR data by geometric averaging of multiple
internal control genes. Genome Biol. 3, research0034.
van Dijk, A.D., Morabito, G., Fiers, M., van Ham, R.C., Angenent, G.C.,
and Immink, R.G. (2010). Sequence motifs in MADS transcription
factors responsible for specificity and diversification of protein-
protein interaction. PLoS Comput. Biol. 6:e1001017.
van Mourik, S., van Dijk, A.D.J., de Gee, M., Immink, R.G.H.,
Kaufmann, K., Angenent, G.C., van Ham, R.C.H.J., and Molenaar,
J. (2010). Continuous-time modeling of cell fate determination in
Arabidopsis flowers. BMC Syst. Biol. 4:101.
Villarreal, C., Padilla-Longoria, P., and Álvarez-Buylla, E.R. (2012).
General theory of gene to phenotype mapping: derivation of
epigenetic landscapes from n-node complex gene regulatory
networks. Phys. Rev. Lett. 109:118102.
Wang, J.W., Czech, B., and Weigel, D. (2009). miR156-regulated SPL
transcription factors define an endogenous flowering pathway in
Arabidopsis thaliana. Cell 138:738–749.
Weigel, D., Alvarez, J., Smyth, D.R., Yanofsky, M.F., and Meyerowitz,
E.M. (1992). Leafy controls floral meristem identity in Arabidopsis. Cell
69:843–859.
Wigge, P.A., Kim, M.C., Jaeger, K.E., Busch, W., Schmid, M.,
Lohmann, J.U., and Weigel, D. (2005). Integration of spatial and
temporal information during floral induction in Arabidopsis. Science
309:1056–1059.
Wilkinson, D.J. (2011). Stochastic Modelling for Systems Biology, 2nd
edn (Boca Raton, FL, USA: Chapman & Hall/CRC Mathematical and
Computational Biology), p. 363.
Wu, G., and Poethig, R.S. (2006). Temporal regulation of shoot
development in Arabidopsis thaliana by miR156 and its target SPL3.
Development 133:3539–3547.
Yamaguchi, A., Wu, M.F., Yang, L., Wu, G., Poethig, R.S., andWagner,
D. (2009). The microRNA-regulated SBP-box transcription factor SPL3
is a direct upstream activator of LEAFY, FRUITFULL, and APETALA1.
Dev. Cell 17:268–278.
Yu, H., Xu, Y.F., Tan, E.L., and Kumar, P.P. (2002). AGAMOUS-LIKE 24,
a dosage-dependent mediator of the flowering signals. Proc. Natl.
Acad. Sci. USA 99:16336–16341.
Yu, H., Ito, T., Wellmer, F., and Meyerowitz, E.M. (2004). Repression of
AGAMOUS-LIKE 24 is a crucial step in promoting flower development.
Nat. Genet. 36:157–161.
Zhao, S., and Fernald, R.D. (2005). Comprehensive algorithm for
quantitative real-time polymerase chain reaction. J. Comput. Biol.
12:1047–1064.
Zhou, J.X., Qiu, X., Fouquier d’Hérouël, A., and Huang, S. (2014).
Discrete gene network models for understanding multicellularity and
cell reprogramming: from network structure to attractor landscape. In
Computational Systems Biology, 2nd edn, R. Eils and A. Kriete, eds.
(San diego, CA, USA: Elsevier), pp. 241–276.
Molecular Plant 8, 796–813, May 2015 ª The Author 2015. 813
XAANTAL2 (AGL14) in Arabidopsis SAM Transitions Molecular Plant
Chapter 5
Conclusiones
In spite of its familiarity, the formation of plausible conclusions
is a very subtle process.
— E . T . Jaynes, Probability Theory - The Logic of Science (2003)
En este proyecto se presenta la perspectiva de un modelo de mapeo genotipo a fenotipo
en términos del rol auto-organizacional de redes regulatorias genéticas para abordar el prob-
lema general de la decisión del destino celular. Se argumenta con base en esta perspectiva que
modelos extendidos de redes regulatorias genéticas pueden representar efectivamente un Paisaje
Epigenético subyacente a un proceso de desarrollo. La caracterización de las propiedades estruc-
turales y cuantitativas de este paisaje pueden ayudar tanto a entender como a predecir eventos
celulares durante procesos de desarrollo, y potencialmente la evolución de estos últimos.
De manera concreta, se propone un marco metodológico para extender modelos de redes reg-
ulatorias genéticas con la intensión de investigar el impacto de perturbaciones a genes espećıficos
en la toma de decisión celular como resultado de la re–estructuración del Paisaje Epigenético
subyacente (Articulo VI). Mediante la aplicación del marco metodológico al caso práctico del
desarrollo floral, se muestra que el Paisaje Epigenético puede ser re-estructurado mediante la
modulación de los tiempos caracteŕısticos de expresión de genes particulares, y se sugiere que
este fenómeno es importante para entender de manera mecanicista el funcionamiento interno
de las células durante la decisión sobre su destino. Los resultados obtenidos sugieren que existe
una relación entre el impacto de genes espećıficos en la dinámica de la red regulatoria genética,
su rol biológico y la observación jerárquica de eventos de decisión celular durante el desarrollo
temprano de la flor. Adicionalmente se especula que el rol dinámico diferencial de los genes
206
descubierto aqúı podŕıa dar información sobre la tendencia de los genes para ligar el módulo
regulatorio con otros circuitos regulatorios o v́ıas de transducción de seales.
En un segundo modelo se integraron datos experimentales en un modelo integrativo de red
de regulación genética. Se propone que la red obtenida constituye un modelo genérico para el
proceso de transformación tumorigénica potencial observado in-vitro y descrito somo el pro-
ceso de inmortalización espontanea (Art́ıculo VII). Mediante el análisis dinámico de la red y
su Paisaje Epigenético subyacente se presenta evidencia de que los componentes moleculares
y las interacciones consideradas son necesarios y suficientes para recuperar los destinos celu-
lares y transiciones observadas durante el fenómeno biológico. Cabe destacar que los destinos
celulares recuperados con en el modelo, y su patrón de transiciones, correlaciona con los pa-
trones observados durante la progresión de la carcinogenesis epitelial in vivo, esto evidenciado
por descripciones patológicas. Los resultados presentados sugieren, entonces, que la potencial
transformación tumorigénica in-vitro como resultado del proceso de inmortalización espontanea
es adecuadamente entendido y modelado al nivel celular de manera genérica como un sistema en
desarrollo que presenta decisiones del destino celular como resultado de las restricciones estruc-
turales y funcionales impuestas, en parte, por las interacciones incluidas en la red subyacente
propuesta.
Por último, bajo la hipótesis de que la relevancia funcional de un red regulatoria subyacente
a un proceso de desarrollo impide una alto grado de variación durante la evolución, en este
proyecto se prueba que los componentes de tal red involucrada en el establecimiento de los
destinos celulares durante el desarrollo temprano de la flor de Arabidopsis se encuentran con-
servados a nivel molecular a lo largo de 18 especies de plantas con flor (Articulo IX). Adicional-
mente, se prueba que existe evidencia de que la red regulatoria ha sido sometida a restricciones
funcionales durante la evolución. Los resultados presentados aqúı soportan la hipótesis original
de que la red regulatoria estudiada constituye un módulo regulatorio que regula un proceso de
desarrollo de manera robusta y que ha sido sometido a fuertes restricciones funcionales durante
la evolución.
En el proyecto en su totalidad presentamos antecedentes necesarios y propuestas de mod-
elado espećıficas para sustanciar nuestra conclusión de que el conjunto de modelos definidos
aqúı como el Paisaje Epigenético de Atractores (Art́ıculo V), se perfilan como la extensión más
207
natural para continuar el protocolo básico de modelado de redes regulatorias genéticas y asi ex-
tender el enfoque de bioloǵıa de sistemas en el estudio del desarrollo. Por último, para impulsar
esta adición al modelado en bioloǵıa de sistemas, se presenta aqúı una implementación novedosa
de los métodos de modelaje del Paisaje Epigenético de Atractores asociado a redes regulatorias
genéticas que esperamos será de utilidad para la comunicad cient́ıfica en la interface entre la
bioloǵıa y las disciplinas cuantitativas (Articulo VIII).
208
Bibliography
Azpeitia, E., Davila-Velderrain, J., Villarreal, C., y Alvarez-Buylla, E.R. Gene
regulatory network models for floral organ determination. En Flower Development, págs.
441–469. Springer (2014)
Davila-Velderrain, J., Martinez-Garcia, J., y Alvarez-Buylla, E. Descriptive vs.
Mechanistic Network Models in Plant Development in the Post-Genomic Era. Plant Func-
tional Genomics: Methods and Protocols págs. 455–479 (2015a)
Dávila-Velderrain, J. y Álvarez-Buylla Roces, E. Linear Causation Schemes in Post-
genomic Biology: The Subliminal and Convenient One-to-one Genotype-Phenotype Mapping
Assumption. INTERdisciplina 3(5) (????)
Davila-Velderrain, J., Martinez-Garcia, J.C., y Alvarez-Buylla, E.R. Epigenetic
Landscape Models: The Post-Genomic Era. bioRxiv (2014a)
Davila-Velderrain, J., Servin-Marquez, A., y Alvarez-Buylla, E.R. Molecular evo-
lution constraints in the floral organ specification gene regulatory network module across 18
angiosperm genomes. Molecular biology and evolution 31(3):560–573 (2014b)
Davila-Velderrain, J., Mart́ınez-Garćıa, J., y Alvarez-Buylla, E.R. Modeling the
Epigenetic Attractors Landscape: Towards a Post-Genomic Mechanistic Understanding of
Development. Name: Frontiers in Genetics 6:160 (2015b)
Pérez-Ruiz, R.V., Garćıa-Ponce, B., Marsch-Mart́ınez, N., Ugartechea-Chirino,
Y., Villajuana-Bonequi, M., de Folter, S., Azpeitia, E., Dávila-Velderrain, J.,
Cruz-Sánchez, D., Garay-Arroyo, A. et al. XAANTAL2 (AGL14) is an important
209
component of the complex gene regulatory network that underlies arabidopsis shoot apical
meristem transitions. Molecular plant 8(5):796–813 (2015)
210