SubhaloIDs and merger trees

Jonathan Mack
  • 29 Apr '16

I'm interested in the merger histories of individual subhalos, in hopes of deriving aggregate results from them. The processing time required is large enough that writing Python code against downloaded group catalogs and merger tree files appears to be the way to go. To begin this analysis, I need all of the unique-across-all-snapshots subhalo IDs in each snapshot, but I can't figure out how to get them without using the browser or the web-based API, which I'd rather avoid because of those performance issues. Is there a way to get all of those IDs using just Python and the downloaded data? Also, it would simplify my analysis if every ID'ed subhalo were included in a merger tree. Is that the case?

Thanks!

Dylan Nelson
  • 29 Apr '16

Hi Jonathan,

The term "IDs" (referring to halos or subhalos) is interchangeable with "indices", e.g. a list of all the "SubfindIDs" (edited, this was always meant to say SubfindIDs) in a snapshot is just a list of integers from 0 to the total number of subhalos in that snapshot, minus one:

import numpy as np
import illustris_python as il

basePath = './Illustris-1/'
h = il.groupcat.loadHeader(basePath,135)  # group catalog header for snapshot 135

>>> h['Nsubgroups_Total']
4366546

>>> inds = np.arange(h['Nsubgroups_Total'])  # every SubfindID in this snapshot
>>> inds.min(), inds.max(), inds.size
(0, 4366545, 4366546)

Dylan Nelson
  • 29 Apr '16

It isn't absolutely guaranteed that all SubfindIDs (across all snapshots) will exist in some tree, at least in the SubLink trees. But, it should be rare, especially with objects which are large enough to be at all resolved.

I'd suggest just checking for this case, and handling it if/when it comes up. E.g.

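# f is assumed to be an open tree file (e.g. an h5py.File); [:] reads each dataset into memory
w = np.where( (f['SubfindID'][:] == subid) & (f['SnapNum'][:] == snap) )
if len(w[0]) == 0:
    raise Exception('This subhalo not in any tree (at least in file f).')

Jonathan Mack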
  • 29 Apr '16

The term "IDs" (referring to halos or subhalos) is interchangeable with "indices", e.g. a list of all the "SubhaloIDs" in a snapshot is just a list of integers from 0 to the total number of subhalos in that snapshot, minus one:

This appears to imply that a single subhaloID is not unique across all snapshots, and as such, could not be "the unique ID (within the whole simulation) of the corresponding subhalo", which is what I believe I need as the 'id' input for, for example, sublink.loadTree(). If that's the case, what's the ID that I'd need for loadTree() and numMergers(), and is there a way to obtain all of the ones in a particular snapshot using only Python and downloaded data?

Dylan Nelson
  • 29 Apr '16

Hi,

Apologies, this is the difference between SubfindID and SubhaloID.

The SubfindID is what is input to the sublink.loadTree() function; it corresponds to the indices of subhalos in the group catalogs, as I described above.

The SubhaloID is essentially an internal value for the trees and you can ignore this, unless digging into how the trees are stored on disk and/or loaded efficiently.
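Since a SubfindID is just an index into one snapshot's group catalog, checking which subhalos of a snapshot are missing from the trees can be done for all of them at once rather than one at a time. A minimal sketch, assuming the SnapNum and SubfindID columns of a tree file have already been read into arrays; the function name and the toy values are made up for illustration:

```python
import numpy as np

def missing_from_tree(tree_snap, tree_subfind, snap, n_subhalos):
    """Return the SubfindIDs in [0, n_subhalos) at snapshot `snap` with no tree entry."""
    in_snap = np.asarray(tree_snap) == snap
    present = np.unique(np.asarray(tree_subfind)[in_snap])
    all_ids = np.arange(n_subhalos)  # every SubfindID in this snapshot
    return np.setdiff1d(all_ids, present)

# Toy stand-ins for the SnapNum and SubfindID columns of a tree file (made-up values).
tree_snap = np.array([135, 135, 134, 135, 134])
tree_subfind = np.array([0, 2, 0, 3, 1])

# n_subhalos would come from h['Nsubgroups_Total'] for the snapshot in question.
missing = missing_from_tree(tree_snap, tree_subfind, snap=135, n_subhalos=5)
print(missing)
```

With real data the tree is split over several files, so the columns from each file would need to be concatenated (or the check repeated per file) before concluding that a subhalo is in no tree.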

Jonathan Mack
  • 4 May '16

Thanks for all your time; it's really helped me out. One more (hopefully last) question: even if SubhaloID isn't necessarily meant to be used as a unique-but-constant-across-all-snapshots-and-trees-of-a-run identifier, can I use it that way anyway? For instance:

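# at minimum, the fields printed below
fields = ['SubhaloID', 'SubfindID', 'SnapNum', 'MainLeafProgenitorID', 'SubhaloMassInRadType']
tree = il.sublink.loadTree(basePath, 27, 0, fields=fields)
print(tree['SubhaloID'], tree['SubfindID'], tree['SnapNum'],
    tree['MainLeafProgenitorID'], tree['SubhaloMassInRadType'][:,4])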

gives

[30000000100047761 30000000100047762] [0 0] [27 26] [30000000100047762 30000000100047762] [ 0.  0.]

It looks like I can use the combination of SubfindID and SnapNum to uniquely identify a subhalo, but it also appears that all the SubLink ...ID fields (MainLeafProgenitorID in the example above) hold values in the form of a SubhaloID, implying that SubhaloID is what I should use to establish tree relationships between subhalos. Thoughts?

Dylan Nelson
  • 4 May '16

Hi,

You're right, you can use either the (SubfindID, SnapNum) pair or the SubhaloID to uniquely identify a subhalo in a given simulation.

And as you say, fields such as LastProgenitorID or MainLeafProgenitorID refer to the SubhaloID of another entry in the tree.
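As a rough illustration of both identifiers in use, the sketch below builds a lookup from (SnapNum, SubfindID) to SubhaloID and follows a MainLeafProgenitorID pointer to its row in the tree. It uses a toy dictionary in place of the loadTree() result (the two rows mimic the output quoted above), and the helper row_of is hypothetical. It assumes that SubhaloIDs are contiguous within a loaded tree, starting at row 0, so a target row is just the difference of two SubhaloIDs; that assumption is checked explicitly rather than trusted.

```python
import numpy as np

# Toy stand-in for a tree returned by loadTree (rows mimic the quoted output).
tree = {
    'SubhaloID': np.array([30000000100047761, 30000000100047762], dtype=np.int64),
    'SubfindID': np.array([0, 0]),
    'SnapNum': np.array([27, 26]),
    'MainLeafProgenitorID': np.array([30000000100047762, 30000000100047762],
                                     dtype=np.int64),
}

# Map each (SnapNum, SubfindID) pair to the SubhaloID of the same tree entry.
pair_to_subhalo_id = {
    (int(s), int(i)): int(sid)
    for s, i, sid in zip(tree['SnapNum'], tree['SubfindID'], tree['SubhaloID'])
}

def row_of(tree, subhalo_id):
    """Row index of subhalo_id, assuming contiguous SubhaloIDs starting at row 0."""
    row = int(subhalo_id - tree['SubhaloID'][0])
    if not (0 <= row < len(tree['SubhaloID'])) or tree['SubhaloID'][row] != subhalo_id:
        raise KeyError(subhalo_id)  # assumption violated, or ID not in this tree
    return row

# Follow the main-leaf pointer of the root (row 0) to the entry it refers to.
leaf_row = row_of(tree, tree['MainLeafProgenitorID'][0])
print(pair_to_subhalo_id[(27, 0)], tree['SnapNum'][leaf_row], tree['SubfindID'][leaf_row])
```

If the contiguity assumption turns out not to hold for a given tree, a dictionary from SubhaloID to row index would do the same job at the cost of some memory.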

Jonathan Mack
  • 4 May '16

You've answered all my questions. Thanks so much!
