Published Works

Characterize the assembly of dark matter halos with protohalo size histories: I. Redshift evolution, relation to descendant halos, and halo assembly bias
Kai Wang, et al.
arXiv, 2023

We propose a novel method to quantify the assembly histories of dark matter halos with the redshift evolution of the mass-weighted spatial variance of their progenitor halos, i.e. the protohalo size history. We find that the protohalo size history for each individual halo at z~0 can be described by a double power-law function. The amplitude of the fitting function correlates strongly with the central-to-total stellar mass ratios of descendant halos. The variation of the amplitude of the protohalo size history can induce a strong halo assembly bias effect for massive halos. This effect is detectable in observations using the central-to-total stellar mass ratio as a proxy of the protohalo size. The correlation with the descendant central-to-total stellar mass ratio and the halo assembly bias effect seen in the protohalo size are much stronger than those seen in the commonly adopted half-mass formation time derived from the mass accretion history. This indicates that the information loss caused by the compression of halo merger trees to mass accretion histories can be captured by the protohalo size history. Protohalo size thus provides a useful quantity to connect protoclusters across cosmic time and to link protoclusters with their descendant clusters in observations.
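For readers unfamiliar with the term, a double power law can be sketched as below. The exact parameterization used in the paper is not given in this summary, so the smoothly broken form, the function name `double_power_law`, and all parameter names here are assumptions chosen for illustration only.

```python
def double_power_law(x, amplitude, x_break, slope_low, slope_high):
    """Generic smoothly broken double power law.

    ASSUMED form for illustration, not necessarily the one fitted in the paper:
        f(x) = A * u**a * ((1 + u) / 2)**(b - a),  with u = x / x_break.
    f scales as x**a for x << x_break and as x**b for x >> x_break,
    and f(x_break) = A, so the amplitude is the value at the break.
    """
    if x <= 0 or x_break <= 0:
        raise ValueError("x and x_break must be positive")
    u = x / x_break
    return amplitude * u**slope_low * (0.5 * (1.0 + u)) ** (slope_high - slope_low)


# Hypothetical parameter values, only to show the two regimes.
value_at_break = double_power_law(1.0, 2.0, 1.0, 0.5, -1.0)
value_far_above = double_power_law(100.0, 2.0, 1.0, 0.5, -1.0)
```

In this sketch the amplitude is simply the function value at the break, which makes it easy to compare amplitudes between objects that share the same slopes.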

Massive Dark Matter Halos at High Redshift: Implications for Observations in the JWST Era
Yangyao Chen, et al.
arXiv, 2023
Recent observations made by the JWST have revealed a number of massive galaxies at high redshift ($z$). The presence of these galaxies appears at odds with the current $\Lambda$CDM cosmology. Here we investigate the possibility of alleviating the tension by incorporating uncertainties from three sources in counting massive galaxies at high $z$: cosmic variance, error in stellar mass estimate, and contribution by backsplash. We find that each of the sources can significantly increase the cumulative stellar mass density $\rho_*(>M_*)$ at the high-mass end, and the combination of them can boost the density by more than one order of magnitude. Assuming a star formation efficiency of $\epsilon_* \sim 0.5$, cosmic variance alone can reduce the tension to $2\sigma$ level, except the most massive galaxy at $z=8$. Including in addition a lognormal dispersion with a width of 0.3 dex in the stellar mass can bring the observed stellar mass density at $z \sim 7 - 10$ to the $2\sigma$ range of the cosmic variance. The tension is completely eliminated when gas stripped from backsplash halos is also taken into account. Our results highlight the importance of fully modeling uncertainties when interpreting observational data of rare objects. We use the constrained simulation, ELUCID, to investigate the descendants of high $z$ massive galaxies. We find that a significant portion of these galaxies end up in massive halos with mass $M_{\rm halo} > 10^{13} h^{-1}M_\odot $ at $z=0$. A large fraction of central galaxies in $M_{\rm halo} \geqslant 10^{14.5} h^{-1}M_\odot$ halos today are predicted to contain significant amounts of ancient stars formed in massive galaxies at $z\sim 8$. This prediction can be tested by studying the structure and stellar population of central galaxies in present-day massive clusters.
A Conditional Abundance Matching Method of Extending Simulated Halo Merger Trees to Resolve Low-Mass Progenitors and Sub-halos
Yangyao Chen, et al.
MNRAS, 2023
We present an algorithm to extend subhalo merger trees in a low-resolution dark-matter-only simulation by conditionally matching them to those in a high-resolution simulation. The algorithm is general and can be applied to simulation data with different resolutions using different target variables. We instantiate the algorithm by a case in which trees from ELUCID, a constrained simulation of $(500h^{-1}{\rm Mpc})^3$ volume of the local universe, are extended by matching trees from TNGDark, a simulation with much higher resolution. Our tests show that the extended trees are statistically equivalent to the high-resolution trees in the joint distribution of subhalo quantities and in important summary statistics relevant to modeling galaxy formation and evolution in halos. The extended trees preserve certain information of individual systems in the target simulation, including properties of resolved satellite subhalos, and shapes and orientations of their host halos. With the extension, subhalo merger trees in a cosmological scale simulation are extrapolated to a mass resolution comparable to that in a higher-resolution simulation carried out in a smaller volume, which can be used as the input for (sub)halo-based models of galaxy formation. The source code of the algorithm, and halo merger trees extended to a mass resolution of $\sim 2 \times 10^8 h^{-1}M_\odot$ in the entire ELUCID simulation, are available.
Environmental dependence of the mass-metallicity relation in cosmological hydrodynamical simulations
Kai Wang, et al.
ApJ, 2023
We investigate the environmental dependence of the gas-phase metallicity for galaxies at $z=0$ to $z\gtrsim 2$ and the underlying physical mechanisms driving this dependence using state-of-the-art cosmological hydrodynamical simulations. We find that, at fixed stellar mass, central galaxies in massive halos have lower gas-phase metallicity than those in low-mass halos. On the contrary, satellite galaxies residing in more massive halos are more metal-rich. The combined effect is that massive galaxies are more metal-poor in massive halos, and low-mass galaxies are more metal-rich in massive halos. By inspecting the environmental dependence of other galaxy properties, we identify that the accretion of low-metallicity gas is responsible for the environmental dependence of central galaxies at high $z$, whereas the AGN feedback processes play a crucial role at low $z$. For satellite galaxies, we find that both the suppression of gas accretion and the stripping of existing gas are responsible for their environmental dependence, with negligible effect from the AGN feedback. Finally, we show that the difference of gas-phase metallicity as a function of stellar mass between protocluster and field galaxies agrees with recent observational results, for example from the MAMMOTH-Grism survey.
Dissect two-halo galactic conformity effect for central galaxies: The dependence of star formation activities on the large-scale environment
Kai Wang, et al.
MNRAS, 2023
We investigate the two-halo galactic conformity effect for central galaxies, i.e. the spatial correlation of the star formation activities of central galaxies out to several Mpc, by studying the dependence of the star formation activities of central galaxies on their large-scale structure in our local Universe using the SDSS data. Here we adopt a novel environment metric that uses only central galaxies and is quantified by the distance to the n-th nearest central galaxy. This metric measures the environment within an aperture from ~1 Mpc to ≳ 10 Mpc, with a median value of ~4 Mpc. We found two kinds of conformity effects in our local Universe. The first one is that low-mass central galaxies are more quenched in high-density regions, and we found that this effect mainly comes from low-mass centrals that are close to a more massive halo. A similar trend is also found in the IllustrisTNG simulation, which can be entirely explained by backsplash galaxies. The second conformity effect is that massive central galaxies in low-density regions are more star-forming. This population of galaxies also possesses a higher fraction of spiral morphology and lower central stellar velocity dispersion, suggesting that their low quiescent fraction is due to less-frequent major merger events experienced in the low-density regions, and as a consequence, less-massive bulges and central black holes.
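The environment metric described above, the distance to the n-th nearest central galaxy, can be sketched in a few lines. This is a minimal brute-force illustration that assumes 3D Cartesian positions in Mpc; the function name `nth_nearest_distance` is hypothetical, and the paper's actual treatment of projection, redshift space, and sample selection is not reproduced here.

```python
import math


def nth_nearest_distance(positions, n):
    """For each point, return the distance to its n-th nearest *other* point.

    positions: list of coordinate tuples (assumed Cartesian, e.g. in Mpc).
    n: neighbour rank, with 1 meaning the closest neighbour.
    Brute force, O(N^2 log N); a spatial tree would be preferable for large catalogs.
    """
    if not 1 <= n < len(positions):
        raise ValueError("n must satisfy 1 <= n < number of points")
    result = []
    for i, p in enumerate(positions):
        dists = sorted(math.dist(p, q) for j, q in enumerate(positions) if j != i)
        result.append(dists[n - 1])
    return result


# Toy example: four centrals along a line.
centrals = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0), (7.0, 0.0, 0.0)]
d2 = nth_nearest_distance(centrals, 2)  # [3.0, 2.0, 3.0, 6.0]
```

A larger distance to the n-th neighbour corresponds to a lower local density of centrals, so the isolated point at the end of the line gets the largest value.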
Relating galaxies across different redshift to study galaxy evolution
Kai Wang, et al.
MNRAS, 2023
We propose a general framework leveraging the galaxy-halo connection to link galaxies observed at different redshifts in a statistical way, and use the link to infer the redshift evolution of the galaxy population. Our tests based on hydrodynamic simulations show that our method can accurately recover the stellar mass assembly histories up to z ~ 3 for present star-forming and quiescent galaxies down to $10^{10} h^{-1}M_\odot$. Applying the method to observational data shows that the stellar mass evolution of the main progenitors of galaxies depends strongly on the properties of descendants, such as stellar mass, halo mass, and star formation states. Galaxies hosted by low-mass groups/haloes at the present time have since z ~ 1.8 grown their stellar mass ~2.5 times as fast as those hosted by massive clusters. This dependence on host halo mass becomes much weaker for descendant galaxies with similar star formation states. Star-forming galaxies grow about 2-4 times faster than their quiescent counterparts since z ~ 1.8. Both TNG and EAGLE simulations overpredict the progenitor stellar mass at z > 1, particularly for low-mass descendants.
Late-formed halos prefer to host quiescent central galaxies. I. Observational results
Kai Wang, et al.
MNRAS, 2023
The star formation and quenching of central galaxies are regulated by the assembly histories of their host halos. In this work, we use the central stellar mass to halo mass ratio as a proxy of halo formation time, and we devise three different models, from the physical hydrodynamical simulation to the empirical statistical model, to demonstrate its robustness. With this proxy, we inferred the dependence of the central galaxy properties on the formation time of their host halos using the SDSS main galaxy sample, where central galaxies are identified with the halo-based group finder. We found that central galaxies living in late-formed halos have higher quiescent fractions and lower spiral fractions than their early-formed counterparts by $\lesssim 8\%$. Finally, we demonstrate that the group finding algorithm has a negligible impact on our results.
Galaxy populations in groups and clusters: evidence for a characteristic stellar mass scale at $M_\ast\sim 10^{9.5}M_\odot$
Jiacheng Meng, et al.
ApJ, 2023

We use the most recent data release (DR9) of the DESI legacy imaging survey and SDSS galaxy groups to measure the conditional luminosity function (CLF) for groups with halo mass $M_{\rm h}\ge 10^{12}M_{\odot}$ and redshift $0.01\le z\le 0.08$, down to a limiting $r$-band magnitude of $M_{\rm r}=-10\sim-12$. For a given halo mass we measure the CLF for the total satellite population, as well as separately for the red and blue populations classified using the $(g-z)$ color. We have the following findings:

  • A clear faint-end upturn is seen in the CLF of red satellites, with a slope $\alpha\approx-1.8$ which is almost independent of halo mass. This faint-end upturn is not seen for blue satellites or for the total population.
  • Our stellar population synthesis modeling shows that the $(g-z)$ color provides a clean red/blue division, and that group galaxies in the red population defined by $(g-z)$ are all dominated by old stellar populations.
  • The fraction of old galaxies as a function of galaxy luminosity shows a minimum at a luminosity $M_{\rm r}\sim-18$, corresponding to a stellar mass $M_\ast\sim10^{9.5}M_\odot$. This mass scale is independent of halo mass and is comparable to the characteristic luminosity at which galaxies show a dichotomy in surface brightness and size, suggesting that the dichotomy in the old fraction and in galaxy structure may have a common origin.
  • The rising of the old fraction at the faint end for Milky Way (MW)-sized halos found here is in good agreement with the quenched fraction measured both for the MW/M31 system and from the ELVES survey.
  • We discuss the implications of our results for the formation and evolution of low-mass galaxies, and for the stellar mass functions of low-mass galaxies to be observed at high redshift.
The Breakdown Scale of HI Bias Linearity
Zhenyuan Wang, et al.
ApJ, 2021

By employing three approaches to generate the mock HI density from an N-body simulation at low z, we check the assumption that HI gas traces the matter density distribution linearly on large scales. Our main findings are:

  • The assumption of HI linearity is valid at the scale corresponding to the first BAO peak, but breaks down at $k \geq 0.1 h {\rm Mpc}^{-1}$.
  • The nonlinear effects of halo clustering and HI content modulation counteract each other at small scales, and their competition results in a model-dependent “sweet-spot” redshift near z=1 where the HI bias is scale-independent down to small scales.
  • The linear HI bias scales approximately linearly with redshift for z ≤ 3.
MAHGIC: A Model Adapter for the Halo-Galaxy Inter-Connection
Yangyao Chen, et al.
MNRAS, 2021

We develop an empirical model pipeline, MAHGIC, to populate dark matter halos with galaxies. The main features of the model and our main results include:

  • PCA and GBDT learners are used to transform halo properties to galaxy properties.
  • Two sets of hydrodynamic simulations, TNG and EAGLE, are used to train the model, which is then applied to other DMO simulations.
  • The model can reproduce a variety of statistical properties of galaxies. It is verified to be reliable, flexible and accurate.
How to empirically model star formation in dark matter halos: I. Inferences about central galaxies from numerical simulations
Yangyao Chen, et al.
MNRAS, 2021

Our study provides a framework for using hydrodynamic simulations to discover, and to motivate the use of, key ingredients to model galaxy formation using halo properties. Our findings include:

  • The SFHs of central galaxies are tightly related to halo MAHs.
  • The classification of SF and quenched populations has significant contamination.
  • We propose a multi-stage halo-based empirical model for the star formation in central galaxies, which reproduces many galaxy statistics and galaxy-halo relations including assembly bias.
Measuring galaxy abundance and clustering at high redshift from incomplete spectroscopic data: Tests on mock catalogs and application to zCOSMOS

We build mock galaxy catalogs for high-z galaxy surveys, and we propose methods to measure GLFs, GSMFs and 2PCFs in the high-z Universe. Our findings include:

  • Our methods of estimating GLFs, GSMFs and 2PCFs reliably cancel the bias from target selection and sample incompleteness.
  • Mock catalogs are constructed for the zCOSMOS-bright sample and the PFS galaxy evolution survey.
  • We quantify the cosmic variance using the mocks, and find that the cosmic variance is reduced by a factor of 3-4 in PFS compared with zCOSMOS.
Finding proto-clusters to trace galaxy evolution: I. The finder and its performance
Kai Wang, et al.
MNRAS, 2021

We develop a method to identify proto-clusters based on dark matter halos at high redshift. Our main findings include:

  • The test with N-body simulations shows that our finder has a completeness of $\sim 85\%$, a purity of $\geq 90\%$, and a mass-estimate uncertainty of $\leq 0.25 {\rm dex}$.
  • Our method can recover the progenitor stellar mass distribution, providing an avenue to link high-z and low-z galaxies in clusters.
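For reference, completeness and purity of a finder are usually computed from a cross-match between the true and the detected objects. The sketch below assumes that the match is already expressed as shared integer IDs, which is a simplification; the matching criteria used in the paper are not stated here, and the function name is hypothetical.

```python
def completeness_and_purity(true_ids, detected_ids):
    """Return (completeness, purity) for a finder.

    completeness = matched / number of true objects
    purity       = matched / number of detected objects
    ASSUMPTION: a detection counts as matched when it carries the same ID
    as a true object; real finders need a physical matching criterion.
    """
    true_set, detected_set = set(true_ids), set(detected_ids)
    matched = true_set & detected_set
    completeness = len(matched) / len(true_set) if true_set else float("nan")
    purity = len(matched) / len(detected_set) if detected_set else float("nan")
    return completeness, purity


# Toy example: 20 true proto-clusters, 18 detections of which 17 are real.
truth = list(range(20))
detections = list(range(17)) + [100]
completeness, purity = completeness_and_purity(truth, detections)
```

With these toy inputs the completeness is 17/20 and the purity is 17/18; the numbers are illustrative and are not taken from the paper.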
An Extended Halo-based Group/Cluster Finder: Application to the DESI Legacy Imaging Surveys DR8
Xiaohu Yang, et al.
ApJ, 2021

We extend the halo-based group finder to use data simultaneously with either photometric or spectroscopic redshifts. The performance is evaluated with a mock from N-body simulation. Our main results include:

  • For galaxies with magnitude $z \leq 21$ in DESI, $\geq 60\%$ of the members in $\sim 90\%$ of halos with $M_{\rm h} \geq 10^{12.5} h^{-1}M_\odot$ can be identified. Detected groups with $M_{\rm h} \geq 10^{12} h^{-1}M_\odot$ have a purity of $\geq 90\%$.
  • Group mass assignment has an uncertainty from 0.2 dex (high-mass end) to 0.45 dex (low-mass end).
  • Groups with 10 members have a redshift accuracy of $\sim 0.08$.
  • A group catalog is provided for DR8.
Relating the Structure of Dark Matter Halos to Their Assembly and Environment
Yangyao Chen, et al.
ApJ, 2020

We use a large N-body simulation to study the relation of structural properties of dark matter halos to their assembly history and environment. Our main conclusions are:

  • The complexity of individual halo assembly histories can be well described by a small number of principal components, which are preferred over formation times for several reasons.
  • 60%, 10% and 20% of the variances in halo concentration, axis ratio and spin, respectively, can be explained by combining four dominating predictors: $\rm PC_{MAH,1}$, $M_{\rm halo}$, $\alpha_\mathcal{T}$ and $b$. Degeneracies between predictors are found and analyzed, and they still hold for mass-binned samples.
  • The tidal field provides important environmental information, with $\alpha_\mathcal{T}$ showing the strongest assembly bias signal.
Identifying galaxy groups at high redshift from incomplete spectroscopic data - I. The group finder and application to zCOSMOS
Kai Wang, et al.
MNRAS, 2020

High-z spectroscopic surveys, usually incomplete in redshift sampling, present both opportunities and challenges to identifying groups in the high-z Universe. We develop a group finder that is based on incomplete redshift samples combined with photometric data. Our main findings are:

  • Mock tests show that $\geq 90\%$ of groups with $M_{\rm h}\geq 10^{12} h^{-1}{\rm M}_\odot$ are successfully identified.
  • The standard deviation in the halo mass estimation is smaller than 0.25 dex at all masses.
  • We apply our group finder to zCOSMOS-bright and describe basic properties of the group catalog obtained.
ELUCID. VI. Cosmic Variance of the Galaxy Distribution in the Local Universe
Yangyao Chen, et al.
ApJ, 2019

We propose a method based on conditional stellar mass functions to estimate the global GSMF. Our findings include:

  • We extend the halo merger trees from an N-body simulation to a higher resolution.
  • We use a constrained N-body simulation and an empirical approach to construct a 'real' mock catalog, which recovers the galaxy distribution in the local Universe (SDSS volume).
  • The low-mass end of the GSMF estimated from the SDSS sample can be significantly affected by cosmic variance (CV).
  • We propose a new method based on the CGSMF that provides an unbiased estimate of the GSMF, which shows a significant upturn at $M_* \leq 10^{9.5} h^{-1}{\rm M}_\odot$ that is missed in many earlier works.

Part-Time Works

Major contributor to the development of the following websites:

  • LIG: the data server of the galaxy and cosmology group in Tsinghua DoA.
  • Tsinghua High-z Team: the homepage of the high-z team in Tsinghua DoA.
  • ELUCID-project: the data server of the ELUCID project.

Developer of the following software:

  • HIPP: a modern C++ toolkit for HPC.
  • HaloProps: a calculator/predictor for halo structural properties. Python API and web application are available.
  • AstroHammer: lectures on astronomical techniques for beginners.

Academic Activity

Academic Meetings

Nov. 1-4, 2016, Hongshan Hotel, Wuhan, Hubei
LSS, Galaxy Formation
June 11-15, 2018, SJTU, Shanghai, China
Galaxy Formation, LSS
July 2, 2018, Lushi County, Sanmenxia, Henan
The 2nd East Asian Workshop on Astrostatistics
Statistics, R lang
July 9, 2018, PMO, Nanjing, China
EGG Workshop
Gas, Galaxy Formation
July 17, 2018, Tsinghua DOA
THU-Phys 2018 Doctoral Student Forum
Sept. 5, 2018, Daoxiang Lake View Hotel, Beijing
Report: Cosmic Variance
Galaxy Formation & Evolution
Sept. 10, 2018, NAOC
Report: Cosmic Variance
HUBS 2018 Workshop
X-ray, Warm-hot IGM
Oct. 15, 2018, Chongming Island, Shanghai, China
PFS 2018 Collab. Meeting
Galaxy Survey
Dec. 10, 2018, SJTU, Shanghai, China
Report 1: Cosmic Variance
Report 2: Empirical Model
Dec. 13-15, 2018, FJU, Shanghai, China
Report: Cosmic Variance
Galaxy Formation, LSS
May 10-13, 2019, Wuyuan Shuixiang Hotel, Xiamen, China
Report: Cosmic Variance
Galaxy Formation & Evolution
Aug. 26-27, 2019, NAOC
Report: Cosmic Variance
THU-Phys 2019 Doctoral Student Forum
Aug. 30-31, 2019, Xinhualian Lijing Hotel, Shunyi, Beijing
PFS 2019 Collab. Meeting
Galaxy Survey
Dec. 9-14, 2019, Caltech, Pasadena, USA
Report: Protocluster identification
Statistical Learning in a Nutshell
ML Models & pipelines
May 16, 2019, Tsinghua DOA
Talk: Statistical Learning in a Nutshell
Abstract: In previous group meetings, many examples of statistical learning algorithms, e.g., SVMs, CNNs, ensemble methods based on random forest and K-Means, were presented in detail. Although there are almost countless algorithms, the core of statistical learning is simple. In this talk, I will give an overall framework of statistical learning, list the general procedure for implementing a statistical learning model, and build connections between different models, with emphasis on the most important parts that we should always pay attention to in order to avoid pitfalls.
An Introduction to the ELUCID Project
Density field, Reconstruction
May 30, 2019, Tsinghua DOA
Talk: An Introduction to the ELUCID Project
Abstract: The ELUCID project is a series of works carried out by Wang H. et al. It provides a framework to reconstruct the underlying initial density field from galaxy surveys. In this talk I will introduce the idea, the algorithms, the main components, and the pipeline behind the reconstruction, including the galaxy group finder, the halo domain method, the HMCMC sampling, and the high-resolution N-body forwarding. I hope this talk helps you understand how the ELUCID pipeline works, so that you can eventually use this database to do more science, e.g., on the environmental effects on galaxies and gas, and on the cosmic variance in galaxy statistics.
THCA Student Seminar: Dark Energy Model
Dark Energy Theory
Dec. 15, 2017, THCA
Talk: DE Model
2017 Personal Summary
Personal Summary
Jan. 18, 2018, THCA
Talk: 2017 Personal Summary
THCA Student Seminar: CALET
CALET Detector, High Energy
March 30, 2018, THCA
2018 THCA AMD Scholarship Defense
Personal Summary
Oct. 25, 2018, THCA
Talk: 2018 Personal Summary
THCA Student Seminar: Herschel
Herschel Space Telescope
Dec. 11, 2018, THCA
Talk: The Herschel Space Telescope
Paper Sharing: K-Means Clustering
Jan. 1, 2019, THCA
Talk: Introduction to the paper "Reproducible k-means clustering in galaxy feature data from the GAMA survey"
DOA Student Seminar: Magnetic Reconnection Experiment
MR experiments
May 10, 2019, Tsinghua DOA
Talk: MR Experiments
Paper Sharing: Hierarchical Bayesian
Bayesian, Satellite Kinematics
Sept. 20, 2019, Tsinghua DOA
Talk: Introduction to the paper "BASILISK: Bayesian Hierarchical Inference of the Galaxy-halo Connection using Satellite Kinematics - I. Method and Validation"
Researches on Star Formation History
July 24, 2020, Zoom
Talk: Researches on SFH
2020 Science Jamboree
Aug. 25, 2020, UMass, USA
Talk: Introduction to my research interests
Oct. 16, 2020, Tsinghua DOA
Talk Slides
The MAHGIC Model
July 7, 2021, IPMU, Japan
Talk Slides
2021 Science Jamboree
Sept. 9, 2021, UMass, USA
Talk: Introduction to my research interests
Talk in a scientific way keep lawyers away