How to calculate neighbour (3/4Mpc) mass density for subhalos at z=0?

li yang

9 May '18

I need all the snapshot particles around every subhalo at z=0 in order to calculate the local (3/4Mpc) mass density. The only loader I know is loadSnapSubset(basePath, snapNum, partType, fields=fields) for IDL; I have already downloaded all the particle data files as snapdir_135/. However, using the loadsnapsubset function is very slow: I have 18000 subhalos at z=0, and for every subhalo I end up reloading all the particles of the snapshot many times. This takes so long that it is not feasible. If I instead use the loadsubhalo or loadhalo functions to load particles, I miss the many particles that lie between the subhalos or halos. I have no idea how to solve this. Could you help me out? I would really appreciate it.

Hi Li,

If you have access to a node with sufficient memory, the approach is pretty simple, i.e.

(1) Loop over each particle type (dm, stars, gas)

(2) Load all Coordinates and Masses, at once, of the entire snapshot (with loadSubset()).

(3) Loop over all subhalos you are interested in.

(4) For each, calculate the (periodic) distance to every particle.

(5) Sum the mass of all particles within your aperture.
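Steps (3) to (5) can be sketched as follows. This is only a minimal illustration on random toy data, not the forum's official method: the function names, the box size and the aperture radius are all hypothetical, and in practice `pos` and `mass` would be the `Coordinates` and `Masses` arrays loaded in step (2), with the radius given in the same length units as the coordinates.

```python
import numpy as np

def periodic_distance(pos, center, box_size):
    """Distance from each row of pos to center in a periodic cube of side box_size."""
    delta = np.abs(pos - center)
    # Take the shorter way around the periodic box along each axis.
    delta = np.minimum(delta, box_size - delta)
    return np.sqrt((delta ** 2).sum(axis=1))

def aperture_mass(pos, mass, center, radius, box_size):
    """Brute-force sum of the mass of all particles within radius of center."""
    dist = periodic_distance(pos, center, box_size)
    return mass[dist <= radius].sum()

# Toy stand-ins for the real snapshot arrays (assumed values, not simulation data).
rng = np.random.default_rng(0)
box_size = 100.0
pos = rng.uniform(0.0, box_size, size=(10000, 3))
mass = np.ones(10000)
centers = np.array([[1.0, 1.0, 1.0], [50.0, 50.0, 50.0]])  # subhalo positions
radius = 5.0

masses = np.array([aperture_mass(pos, mass, c, radius, box_size) for c in centers])
density = masses / (4.0 / 3.0 * np.pi * radius ** 3)  # mean density in each sphere
```

Each call touches every particle once, so the total cost scales with the number of subhalos times the number of particles.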

Now, to load 1820^3 positions and masses at once (assuming float32), you'll need ~90GB of memory, so a 128 GB node should work for this. If this is too large, you can load the snapshot in arbitrary chunks, since you are only accumulating sums. Lastly, the loop in step (4) will of course be quite slow. The alternative would be to construct a tree (e.g. scipy kdtree) and then do sphere searches on the tree, instead of brute-force distance calculations.
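The memory figure and the chunked alternative can be sketched like this. The helper names are hypothetical, and `toy_chunks` simply slices an in-memory array to stand in for reading one piece of the snapshot at a time; how the real chunks are read depends on the loader you use.

```python
import numpy as np

# Memory estimate: 3 coordinates + 1 mass per particle, 4 bytes each (float32).
est_bytes = 1820 ** 3 * 4 * 4  # about 96e9 bytes, i.e. roughly 90 GiB

def accumulate_aperture_mass(chunks, centers, radius, box_size):
    """Sum the mass within radius of each center, one particle chunk at a time."""
    total = np.zeros(len(centers))
    for pos, mass in chunks:
        for i, c in enumerate(centers):
            delta = np.abs(pos - c)
            delta = np.minimum(delta, box_size - delta)  # periodic wrap
            inside = (delta ** 2).sum(axis=1) <= radius ** 2
            total[i] += mass[inside].sum()
    return total

def toy_chunks(pos, mass, chunk_size):
    """Hypothetical stand-in for reading the snapshot piece by piece."""
    for start in range(0, len(pos), chunk_size):
        yield pos[start:start + chunk_size], mass[start:start + chunk_size]

# Toy data (assumed values, not simulation data).
rng = np.random.default_rng(1)
box_size = 100.0
pos = rng.uniform(0.0, box_size, size=(5000, 3))
mass = rng.uniform(0.5, 1.5, size=5000)
centers = np.array([[2.0, 98.0, 50.0], [40.0, 60.0, 10.0]])
radius = 10.0

total = accumulate_aperture_mass(toy_chunks(pos, mass, 1000), centers, radius, box_size)
```

Because the result is a plain sum, the chunk size only changes peak memory use, not the answer.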

li yang

10 May '18

Hi Dylan, how do I construct a tree (e.g. scipy kdtree)? What are sphere searches on the tree? I don't understand the 'tree' you mentioned. Do you mean a merger tree?
I only need to calculate the mass density around subhalos at z=0 within a sphere. For a given subhalo, I don't know which of the 512 hdf5 files contain its neighbouring dark matter particles. So I must use loadsnapsubset() to load all the particles of snapshot 135, and then count the number of dark matter particles within a sphere of radius 3/4Mpc. From that I can get the dark matter density around the subhalo. This method takes a lot of time because of the nested loops. Could you explain your solution in more detail?

Dylan Nelson

10 May '18

Hi Li,

By tree I mean a spatial-partitioning data structure, with the objective of accelerating spatial queries such as spherical searches. I mention this only because the distance calculations (required Nsubs * Npart times) will be expensive. Perhaps you can think about it if needed.
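As a rough illustration of such a tree, the sketch below uses `scipy.spatial.cKDTree`, whose `boxsize` argument makes the searches periodic and whose `query_ball_point` method returns the indices of all points within a given radius. The data, box size, radius and the helper name `aperture_mass_tree` are assumptions for the example; the coordinates must lie in `[0, box_size)`.

```python
import numpy as np
from scipy.spatial import cKDTree

def aperture_mass_tree(tree, mass, centers, radius):
    """Sum the mass within radius of each center using sphere searches on the tree."""
    # One list of particle indices per center.
    neighbours = tree.query_ball_point(centers, r=radius)
    return np.array([mass[idx].sum() for idx in neighbours])

# Toy data (assumed values, not simulation data).
rng = np.random.default_rng(2)
box_size = 100.0
pos = rng.uniform(0.0, box_size, size=(20000, 3))
mass = rng.uniform(0.5, 1.5, size=20000)
centers = rng.uniform(0.0, box_size, size=(50, 3))  # subhalo positions
radius = 5.0

# Build the tree once; boxsize enables periodic boundary conditions.
tree = cKDTree(pos, boxsize=box_size)
masses = aperture_mass_tree(tree, mass, centers, radius)
density = masses / (4.0 / 3.0 * np.pi * radius ** 3)
```

The tree is built once over all particles, after which each sphere search only visits nearby regions instead of computing the distance to every particle.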

My above proposal is to first load all 512 files at once, into memory. Then, loop over each subhalo, and find the particles in the 3/4Mpc sphere.

In general, the particles within the 3/4Mpc sphere for a subhalo will not be contained in any one of the 512 files, so you need to search them all.

li yang

10 May '18

Hi, I loaded all the dark matter particles of snapshot 135 into memory only once. For each subhalo, I selected a sub-box region around the position of the subhalo before doing the spherical search, to avoid looping (Nsub*Ndm) times. But it still took a long time to finish. I have now solved the problem by switching from IDL to Python (in IDL, loadsnapsubset() could not load the dark matter particles of the whole snapshot). Anyway, I really appreciate your patient answers.
