Halo level VS sub-halo level (w. mass criteria)

André Barbosa
  • 20 Dec '22

Hi All,

I am trying to adapt this snippet, which retrieves a sample at the subhalo (galaxy) level:

mass_min = 10**12 / 1e10 * 0.704
mass_max = 10**12 / 1e10 * 1.500
search_query = "?mass__gt=" + str(mass_min) + "&mass__lt=" + str(mass_max)
search_query
url = "http://www.tng-project.org/api/Illustris-1/snapshots/z=0/subhalos/" + search_query
subhalos = get(url)
subhalos['count']

to the FoF level, i.e. I need to extract a sample of host/parent halos using a similar mass criterion:

mass_min = 10**12 / 1e10 * 0.704
mass_max = 10**12 / 1e10 * 1.500
search_query = "?mass__gt=" + str(mass_min) + "&mass__lt=" + str(mass_max)
search_query
url = "http://www.tng-project.org/api/Illustris-1/snapshots/z=0/halos/" + search_query
halos = get(url)
halos['count']

However, there is no "search_query" option for halos in the API.

Basically, how can we extract a list of FoF halos/groups at z=0 for Illustris-1-Dark, based on a mass criterion, using the API?

(Something like a lambda function applied across the catalog.) Otherwise this seems possible only with local downloads.

Regards,
André

Dylan Nelson
  • 21 Dec '22

Hi Andre,

That is correct, the API is built mostly around subhalos, not halos. (Same for the merger trees). So you cannot easily search on halo properties with the API.

I would suggest simply downloading the Illustris-1-Dark z=0 group catalog. It is only 4.5GB, and then you can run arbitrarily complex searches (e.g. with np.where()).

If you really want, you can even download just one field of the group catalog, using the [base]/groupcat-{num}/?{subset_query} method of the API. For example, for the M200c values of halos, you can just download the url:

https://www.tng-project.org/api/Illustris-1/files/groupcat-135/?Group=Group_M_Crit200
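A minimal sketch of how such a single-field URL could be built and fetched from Python. The function names (groupcat_field_url, download) are illustrative and not part of any TNG library; the only things taken from the API are the URL pattern above and the "API-Key" request header.

```python
import shutil
import urllib.request

BASE = "https://www.tng-project.org/api"

def groupcat_field_url(simulation, snapshot, field, group="Group"):
    """Build a [base]/groupcat-{num}/?{subset_query} URL for one field."""
    return f"{BASE}/{simulation}/files/groupcat-{snapshot}/?{group}={field}"

def download(url, api_key, out_path):
    """Stream the response to disk, sending the API key as a header."""
    request = urllib.request.Request(url, headers={"API-Key": api_key})
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return out_path

# Reproduces the example URL above; nothing is downloaded until download() is called.
url = groupcat_field_url("Illustris-1", 135, "Group_M_Crit200")
print(url)
```

Calling download(url, "your-api-key", "m200c.hdf5") would then save the HDF5 file locally (the output filename here is arbitrary).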

André Barbosa
  • 21 Dec '22

Thanks Dylan,

Just one follow-up question on scale/units.

If we run dataframe.describe() on the M200c values taken from the FoF catalog of Illustris-TNG-1-Dark, we get this distribution:

       HaloM_Crit200
count   4.231400e+06
mean    3.649974e-01
std     3.167234e+01
min     0.000000e+00
25%     1.679913e-02
50%     2.699860e-02
75%     6.119683e-02
max     2.467659e+04

Based on this, I need to extract a sample of MW-like halos with 0.7×10^12 M⊙ < M200c(z=0) < 1.5×10^12 M⊙. However, I cannot really understand the unit conversion factor here.

So I tried opening another catalog, already focused on MW-like galaxies, taken from (m) Disk Galaxy Kinematic Decompositions, for which we get:

        StellarMass      LogMass
count  3.931000e+03  3931.000000
mean   3.341204e+10    10.405426
std    3.412106e+10     0.293844
min    9.999000e+09     9.999957
25%    1.461000e+10    10.164650
50%    2.276000e+10    10.357172
75%    3.909000e+10    10.592066
max    5.742000e+11    11.759063

I could definitely apply the filtering criteria against these values. So, can we convert from DM M200c masses to stellar masses in order to apply that filtering? Otherwise, what conversion factor can I apply to the DM M200c values in order to extract a MW-like sample?

Many thanks for your super detailed answers to my (dumb) questions,
André

Dylan Nelson
  • 21 Dec '22

You always need to check, and carefully convert, the units of every field. You can never assume or guess what the units might be.

For the catalogs and snapshots, the data specifications documentation gives the units of each field.

For masses we have the common factor of 1e10 / h = 1e10 / 0.6774 for TNG, to convert from "code mass units" to "solar masses".
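As a rough sketch of that conversion (the helper names are illustrative, not from any TNG library), here applied to the largest value in the describe() output above and to the 0.7e12 to 1.5e12 Msun window:

```python
# Little-h values: 0.6774 is the TNG value quoted above; 0.704 is the
# Illustris value that appears in the first post of this thread.
LITTLE_H = {"TNG": 0.6774, "Illustris": 0.704}

def code_mass_to_msun(mass_code, h):
    """Convert a catalog mass (units of 1e10 Msun/h) to solar masses."""
    return mass_code * 1e10 / h

def msun_to_code_mass(mass_msun, h):
    """Inverse conversion: solar masses to catalog units (1e10 Msun/h)."""
    return mass_msun * h / 1e10

h = LITTLE_H["TNG"]

# Largest M200c in the table above (2.467659e+04 in code units).
largest_msun = code_mass_to_msun(24676.59, h)
print(f"largest halo: {largest_msun:.4e} Msun")

# The same mass window expressed in code units, so it can be compared
# directly against the raw catalog values.
window_lo = msun_to_code_mass(0.7e12, h)
window_hi = msun_to_code_mass(1.5e12, h)
print(f"window in code units: {window_lo:.3f} to {window_hi:.3f}")
```

Either direction works: convert the catalog to solar masses and filter in physical units, or convert the window bounds to code units and filter the raw values.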

For values from anywhere else, e.g. a supplementary data catalog like "(m) Disk Galaxy Kinematic Decompositions", the units could be anything. They will be described under the documentation for that dataset. So you need to handle the unit conversions as needed.

André Barbosa
  • 21 Dec '22

Thanks Dylan,

I will use the conversion factor (0.6774/1e2) to convert the DM M200c values to units of 10^12 M⊙, and then apply the filter.

I will also check what Min Du et al. have done for (m).

Regards,
André

André Barbosa
  • 21 Dec '22

If we take the Group_M_Crit200 value from the Illustris-1-Dark catalog (multiplied by h/100, converting 10^10 M⊙/h units to 10^12 M⊙), then the number of FoF halos/groups within the [0.7, 1.5] × 10^12 M⊙ bounds is 60,726 out of a total of 4,231,400, which looks far too large, so we'll need to validate it.

This will be our starting point: validating this figure first, and then applying subsequent filters on i) kinematics, ii) morphology, iii) placement within the larger structure and iv) merger-tree accretion history...

Thanks for your help/guidance
André

Dylan Nelson
  • 21 Dec '22

Hello Andre,

I admit I am not totally sure of your procedure.

For your information, for Illustris-1-Dark at redshift zero, there are 1164 halos with M200c between 0.7e12 Msun and 1.5e12 Msun.

André Barbosa
  • 21 Dec '22

I am definitely missing something here. Could you please also check these counts?

a] "https://www.tng-project.org/api/Illustris-1-Dark/files/groupcat-99/?Group=Group_M_Crit200" has 4,737,168 halos

b] the main FoF group catalog file on the Data Downloads page https://www.tng-project.org/data/downloads/TNG100-1-Dark/ has 4,231,400 halos

I thought the URL file shortcuts were extracts of the original FoF catalogs. Maybe a) is not pointing to z=0; could you please confirm?

André Barbosa
  • 21 Dec '22

Ok Dylan,

a) was pointing to Illustris-1 instead of TNG, so I have reconciled it with the correct dataset:

a] "https://www.tng-project.org/api/TNG100-1-Dark/files/groupcat-99/?Group=Group_M_Crit200" has 4,231,400 halos

b] the main FoF group catalog file on the Data Downloads page https://www.tng-project.org/data/downloads/TNG100-1-Dark/ has 4,231,400 halos

Now, how do I reconcile your 1164 with my (obviously wrong) 60k+? I am using this lambda filter:

df2.assign(HaloM_Crit200=lambda x: 0.6774*df2['HaloM_Crit200']/100)
count = getCount(data, lambda x: x > 0.7 and x < 1.5)

Dylan Nelson
  • 21 Dec '22

Hi Andre,

Unfortunately I am not sure what the lambda filter is, or what it is doing. If you simply load Group_M_Crit200 for TNG100-1-Dark and use numpy:

In [3]: M200 = Group_M_Crit200 * 1e10 / 0.6774

In [4]: w = np.where((M200>0.7e12) & (M200<1.5e12))[0]

In [5]: len(w)
Out[5]: 1387

André Barbosa
  • 21 Dec '22

Bingo! I had swapped the lambda expression:

df2.assign(HaloM_Crit200=lambda x: 1e10*df2['HaloM_Crit200']/0.6774)

It's working OK now, 1387 Halos within bounds indeed - many thanks!
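For anyone following along with pandas, here is a minimal sketch of the corrected filter on made-up values (the column name M200c_Msun is my own, and the numbers are toy data, not catalog entries). Two details are easy to trip over: assign() returns a new DataFrame rather than modifying in place, and a range test on a column needs element-wise & rather than Python's `and`.

```python
import pandas as pd

H = 0.6774  # TNG little-h, as quoted earlier in the thread

# Toy values in catalog units (1e10 Msun/h), standing in for Group_M_Crit200.
df = pd.DataFrame({"HaloM_Crit200": [0.0, 0.027, 47.0, 60.0, 101.0, 102.0, 24676.59]})

# assign() returns a new DataFrame, so the result has to be kept.
df = df.assign(M200c_Msun=lambda d: d["HaloM_Crit200"] * 1e10 / H)

# Element-wise range test: parentheses plus &, not `and`.
in_window = (df["M200c_Msun"] > 0.7e12) & (df["M200c_Msun"] < 1.5e12)
mw_like = df[in_window]

print(len(mw_like))            # number of halos inside the mass window
print(mw_like.index.tolist())  # their row numbers in the original table
```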

André Barbosa
  • 22 Dec '22

Hi Dylan,

Hope all is well, many thanks for your help.

The next step is getting the Halo Structure data, so I am trying to wget it by following the FAQ:

wget -nc --content-disposition --header="API-Key: xxxMY_API_KEYxxx" "http://www.tng-project.org/api/TNG100-1-Dark/files/halo_structure.99.hdf5"

2 questions please:

1) Once we load this data, how can we join the Group FoF Catalog with the Halo_Structure? I can't find an ID which can be used to join these...

2) When I try to wget using my own API_KEY, the server bounces back and the following happens:

Resolving www.tng-project.org (www.tng-project.org)... 130.183.17.94
Connecting to www.tng-project.org (www.tng-project.org)|130.183.17.94|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.tng-project.org/api/TNG100-1-Dark/files/halo_structure.99.hdf5 [following]
--2022-12-22 12:37:22--  https://www.tng-project.org/api/TNG100-1-Dark/files/halo_structure.99.hdf5
Connecting to www.tng-project.org (www.tng-project.org)|130.183.17.94|:443... connected.
HTTP request sent, awaiting response... 302 Found
Resolving data-eu.tng-project.org (data-eu.tng-project.org)... 130.183.17.94
Connecting to data-eu.tng-project.org (data-eu.tng-project.org)|130.183.17.94|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2022-12-22 12:38:03 ERROR 403: Forbidden.

Dylan Nelson
  • 22 Dec '22

(1) There are two types of datasets in general. Either a dataset will have a size (or shape) smaller than the actual number of halos (or subhalos) at that snapshot. In this case, there needs to be some entry called e.g. "SubfindID" or "HaloID" which tells you how to join.

Or, a dataset will simply have the same size as the actual number of halos or subhalos. In this case, there is a 1-to-1 correspondence, i.e. the first entry of each is the same object, and so on. This is the case with the "Halo Structure" supplementary catalog. There, you need to consider "GroupFlag", i.e. although the size is the same, some entries will be blank, as they were skipped on purpose.
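The two cases can be sketched with toy arrays as follows. All values are invented, and the meaning of the flag (nonzero = entry was computed) is an assumption made for illustration only; check the catalog documentation for the real convention.

```python
import numpy as np

# Full group catalog: one row per halo (toy values, catalog mass units).
group_m200c = np.array([500.0, 80.0, 60.0, 0.5, 0.1])

# Case A: supplementary dataset smaller than the catalog, carrying an
# explicit ID column (hypothetical "HaloID" entries) that says which
# catalog row each entry belongs to.
supp_halo_id = np.array([0, 2])
supp_value = np.array([1.7, 0.9])
matched_mass = group_m200c[supp_halo_id]  # catalog rows for each entry

# Case B: same length as the catalog, so row i in both is halo i.
structure_value = np.array([3.1, 2.4, np.nan, 2.0, np.nan])
# ASSUMPTION: nonzero flag marks entries that were actually computed.
group_flag = np.array([1, 1, 0, 1, 0])

valid = group_flag != 0
halo_ids = np.flatnonzero(valid)  # row numbers of the usable entries
joined = np.column_stack([halo_ids, group_m200c[valid], structure_value[valid]])
print(joined)
```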

(2) This works fine for me, not sure what the problem is. If you just try again, does it still not work? You can also double-check by just going to the URL in your browser.

André Barbosa
  • 22 Dec '22

Perfect thanks!

How can we add the HaloID to the subset_query below?

"https://www.tng-project.org/api/TNG100-1-Dark/files/groupcat-99/?Group=Group_M_Crit200"

Dylan Nelson
  • 22 Dec '22

HaloID is not a field, it is implicit. It is the index of any Group* dataset, i.e. the row number.
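As a small illustration with invented numbers (not real catalog values), the IDs can be generated from the array length and carried through any selection:

```python
import numpy as np

H = 0.6774  # TNG little-h

# Toy stand-in for the downloaded Group_M_Crit200 array (1e10 Msun/h).
group_m200c = np.array([24676.59, 75.0, 0.03, 90.0, 0.0])

# The HaloID of each entry is simply its position in the array.
halo_id = np.arange(group_m200c.size)

# Apply a mass cut in solar masses and keep the IDs of the survivors.
m200c_msun = group_m200c * 1e10 / H
selected_ids = halo_id[(m200c_msun > 0.7e12) & (m200c_msun < 1.5e12)]
print(selected_ids)

# The IDs index straight back into any other Group* dataset of the same length.
print(group_m200c[selected_ids])
```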

André Barbosa
  • 22 Dec '22

Many thanks.

André Barbosa
  • 22 Dec '22

One thing I'd like to add: I have been working in IT for 25+ years, in both industry and academia, and this is by far one of the best-documented, most well-structured APIs I have ever seen.

Congratulations to the Team.

André Barbosa
  • 22 Dec '22

Hi Dylan,

I am trying to compare the FoF Group Catalog with the Halo Structure Catalog.

[A]

fof_subhalo_tab_099.Group.Group_M_Crit200.hdf5
['Group_M_Crit200']
(4231400,) total number of entries
First Mass Value is 24676.59, i.e. 364283900000000.0 M⊙

[B]

halo_structure_099.hdf5
M200 = hdf5_file['M200c']
(4231400,) total number of entries - MATCH
First Mass Value is 14.56144  - NOT MATCH

Per the specification, the Halo Structure Catalog (https://www.tng-project.org/data/docs/specifications/#sec5q) should list the same M200c mass as the FoF Group Catalog. Apologies for asking, but am I missing something here, or have I reconciled the wrong data? I got these for z=0 (snap 99) with TNG100-1-Dark.

Please ignore: the Halo Structure catalog contains log(FoF M200c), so this does reconcile. Thanks!
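For the record, a quick check with the two first values listed above, assuming h = 0.6774 and that the log is base 10 of the mass in solar masses (the numbers are consistent with that reading):

```python
import math

H = 0.6774  # TNG little-h

fof_m200c_code = 24676.59   # first Group_M_Crit200 entry, in 1e10 Msun/h
structure_m200c = 14.56144  # first M200c entry in the Halo Structure file

# Convert the FoF value to solar masses, then take log10.
fof_m200c_msun = fof_m200c_code * 1e10 / H
log_mass = math.log10(fof_m200c_msun)

print(f"{fof_m200c_msun:.6e} Msun, log10 = {log_mass:.5f}")
print("match:", abs(log_mass - structure_m200c) < 1e-4)
```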
