How to get the index of a particle in snapshot file when particle ID is known?

Zhaozhou Li
  • 5
  • 15 May '16

I'm considering tracing a disrupted subhalo by following its most bound particles, identified at the last snapshot where it is still alive. To get the current (z=0) position/velocity of such a particle, I have to compare its ID with that of every particle until I find a match, something like this:

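def find_particle(partID):
    # brute force: scan every file, every particle type, every particle
    for i in range(nfiles):
        for j in range(ntypes):
            for k in range(npart_thisfile):
                if ParticleID[k] == partID:
                    return i, j, k
    return None  # ID not found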

This would be quite slow when I have many particles to trace. Do you have any better ideas for getting the position/velocity of a particle when its unique ID is known?

As far as I know, the particle ID encodes its initial position. Does it contain any other information?

Is it possible to tell the type of a particle (or which file it is stored in) from its ID?

Another question: if a gas particle ceases to exist (as gas), how can its matter be traced? Using tracer particles? In any case, maybe I can use the most bound DM particle as a workaround.

Sorry if I missed anything obvious. Thanks a lot!

Zhaozhou Li
  • 6
  • 15 May '16

For the moment, the best way I have found is to build a binary search tree or a hash table. I'm not very familiar with these algorithms, and I'm not sure about the efficiency of Python (e.g. a Python dictionary?); maybe Cython is needed.

Hmm, or build a sorted array like

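import numpy as np
ix = np.argsort(particleIDs)          # permutation that sorts the IDs
id_arr = particleIDs[ix]              # IDs in ascending order
ind_arr = (index*512 + numfile)[ix]   # packs index-in-file and file number; requires numfile < 512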

(Use C for a more efficient implementation if necessary.)

Then use a binary search over id_arr to get the corresponding index and numfile. This array can be stored on disk, so it only has to be built once.
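A minimal sketch of that lookup table, assuming the IDs of each snapshot file have already been read into one NumPy array per file. The names `build_lookup`, `lookup` and `ids_per_file` are made up for illustration, and `NFILES = 512` simply mirrors the factor used above.

```python
import numpy as np

NFILES = 512  # assumption: upper bound on the number of snapshot files

def build_lookup(ids_per_file):
    """ids_per_file: list of 1-D integer ID arrays, one per file (hypothetical input)."""
    ids = np.concatenate(ids_per_file)
    # position of every particle inside its own file
    index = np.concatenate(
        [np.arange(len(a), dtype=np.int64) for a in ids_per_file])
    # file number of every particle
    numfile = np.concatenate(
        [np.full(len(a), n, dtype=np.int64) for n, a in enumerate(ids_per_file)])
    ix = np.argsort(ids)
    return ids[ix], (index * NFILES + numfile)[ix]

def lookup(id_arr, ind_arr, target_id):
    """Return (numfile, index) for target_id, or None if it is not in the table."""
    pos = np.searchsorted(id_arr, target_id)  # binary search
    if pos == len(id_arr) or id_arr[pos] != target_id:
        return None
    index, numfile = divmod(int(ind_arr[pos]), NFILES)  # undo the packing
    return numfile, index
```

Both returned arrays can be written once with np.save and read back with np.load, so the sort does not have to be repeated for every tracing run.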

Dylan Nelson
  • 17 May '16

Hi Zhaozhou,

First, for DM. It depends on whether you need to find the index of just one ID or of many. If just one, loop over all DM IDs in the snapshot until you find it (you will have to load them all). If many, then as you say, the efficient solution involves sorting: load all DM IDs, sort them, and then perform a bisection search for each target ID. In Python you can use np.searchsorted(). Alternatively, sort the array of search IDs as well, then walk down both sorted arrays simultaneously in a single pass. The complexity should be similar. See e.g. match() in IDL for this algorithmic idea (efficient but memory intensive).
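As a rough illustration of the sorted bisection search, here is a sketch using np.argsort and np.searchsorted. The function name `match_ids` and the convention of returning -1 for IDs that are not present are assumptions made for this example.

```python
import numpy as np

def match_ids(snapshot_ids, target_ids):
    """For each target ID, return its index in snapshot_ids, or -1 if absent."""
    snapshot_ids = np.asarray(snapshot_ids)
    target_ids = np.asarray(target_ids)
    if snapshot_ids.size == 0:
        return np.full(target_ids.shape, -1, dtype=np.int64)
    # sort once; `order` maps sorted positions back to original indices
    order = np.argsort(snapshot_ids, kind="stable")
    # bisection search of every target in the sorted view
    pos = np.searchsorted(snapshot_ids, target_ids, sorter=order)
    # targets larger than every ID land one past the end; clip them back
    pos = np.clip(pos, 0, len(order) - 1)
    idx = order[pos]
    found = snapshot_ids[idx] == target_ids
    return np.where(found, idx, -1)
```

The returned indices can be used directly on the position/velocity arrays loaded in the same order as the IDs. For the variant that sorts both arrays, np.intersect1d with return_indices=True is one ready-made option, provided the IDs are unique.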

The DM ID does encode the initial position, but I don't see this as useful here.

However, you could accelerate this search dramatically by using the fact that you know the group/subhalo membership of the target ID(s). For instance, if you have a DM ID from z=0 and want to find its index in previous snapshots, you could hope that it stays in the main progenitor branch (or the entire merger tree), and first load only the very small subset of DM IDs which are in that progenitor subhalo. If you find it, you're done. If such tricks fail, you would have to fall back to loading all DM IDs in the whole snapshot. Whether writing such a search function is worth the added complexity depends on your application.
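A sketch of this two-stage search, under stated assumptions: `load_subhalo_ids` and `load_all_ids` are hypothetical callables supplied by the user (they are not part of any existing library), the first returns the subhalo's ID array together with the global index of its first particle, and -1 marks IDs that are not found anywhere.

```python
import numpy as np

def find_in_ids(ids, target_ids):
    """Index of each target in ids via sorted bisection search, -1 where missing."""
    ids = np.asarray(ids)
    target_ids = np.asarray(target_ids)
    if ids.size == 0:
        return np.full(target_ids.shape, -1, dtype=np.int64)
    order = np.argsort(ids, kind="stable")
    pos = np.clip(np.searchsorted(ids, target_ids, sorter=order), 0, len(ids) - 1)
    idx = order[pos]
    return np.where(ids[idx] == target_ids, idx, -1)

def locate(target_ids, load_subhalo_ids, load_all_ids):
    """Try the small subhalo subset first; load the full snapshot only if needed."""
    target_ids = np.asarray(target_ids)
    sub_ids, offset = load_subhalo_ids()        # cheap: a few particles
    local = find_in_ids(sub_ids, target_ids)
    result = np.where(local >= 0, local + offset, -1)
    missing = result < 0
    if missing.any():
        all_ids = load_all_ids()                # expensive: whole snapshot
        result[missing] = find_in_ids(all_ids, target_ids[missing])
    return result
```

The expensive global load is skipped entirely whenever every target is still inside the progenitor subhalo.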

Finally, as for gas: although you can match IDs between snapshots, be aware that it is incorrect to think of gas cells that share an ID in two snapshots as the same material, because cells may be exchanging mass. Indeed, as you say, to follow baryonic mass/matter through time, the tracer particles are required.

Last, there is no clear way to determine a particle's type from its ID. However, the set of DM IDs does not change in time.

For your application, I would stick to DM, and attempt the sorted bisection search on the global DM ID load at each snapshot.

Zhaozhou Li
  • 18 May '16

Very practical advice. Thanks a lot!
