Using API on TNG100-1-Dark

Sophia Nasr

UPDATE: It seems changing the limit does something, but it doesn't quite allow the scan to go through (I get a timeout). Any ideas on how to fix this?


I did an analysis in the regular TNG100-1 run using the API. Now, I'm trying to do the same thing using TNG100-1-Dark, and am getting strange errors. For example,

ids = [subhalos['results'][i]['id'] for i in range(subhalos['count'])]

should work as it scans over all available subhalos in that snap. And it does in TNG100-1. But when I try to do the same for snapshot 84 in TNG100-1-Dark, which has over 5,000,000 subhalos, it keeps telling me the index is out of range, and will only allow me to scan up to 500 so that I have to use

ids = [subhalos['results'][i]['id'] for i in range(500)]

I tested random numbers, going down from 5,000,000 all the way to 500. When I then tried 501, it said the index is out of range, even though print(subhalos['count']) reports 5196947 subhalos in this snapshot. Is there a limit that I have to override when scanning the dark-matter-only runs? I didn't have to set any limit in the full TNG100-1 run, so I was wondering whether something is different about the Dark run, or whether a limit has been placed on your end. Is there a way for me to scan over all the subhalos in a snapshot?

Thank you!

Dylan Nelson
  • 8 Apr

Hi Sophia,

Any API request towards an endpoint like

will by default return 100 results. I suspect you have already changed these defaults then? Can you post a minimal working example of code which reproduces the issue?
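To illustrate why the index error appears, here is a minimal sketch using a mocked response rather than a real request. The dictionary layout ('count', 'next', 'results') follows the fields mentioned in this thread; the values and the placeholder URL are made up for illustration.

```python
# Mock of a single paginated response (all values are illustrative).
page = {
    "count": 5196947,                        # total matches on the server
    "next": "placeholder-url-for-page-2",    # hypothetical placeholder
    "results": [{"id": i} for i in range(100)],  # only one page is returned
}

# Safe: iterate over the entries that were actually returned.
ids = [sub["id"] for sub in page["results"]]

# Unsafe: indexing up to 'count' runs past the end of this page.
try:
    bad = [page["results"][i]["id"] for i in range(page["count"])]
except IndexError:
    bad = None  # fails as soon as i reaches len(page["results"])
```

The point is that 'count' describes the full result set, while 'results' holds only the current page, so the two lengths only agree when everything fits in one page.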

Sophia Nasr

Hi Dylan,

Here's the code I use to search for specific subhalos in the snapshot:

limNumMassCriteria = 40000
limNum = 400000
mass_dm_min = 10**13 / 1e10 * 0.6774
mass_dm_max = 6*10**13 / 1e10 * 0.6774
# mass_stars_min = 8*10**10 / 1e10 * 0.6774
# mass_stars_max = 10**12 / 1e10 * 0.6774
redshift = 0.2

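import requests

# note: `headers` must be defined before calling this function (not shown in this post)
def querySubhalosDM(mass_dm_min, mass_dm_max, limNumMassCriteria, limNum, simRun, redshift, snap):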
    def get(path, params=None):
        # make HTTP GET request to path
        r = requests.get(path, params=params, headers=headers)

        # raise exception if response code is not HTTP SUCCESS (200)
        r.raise_for_status()

        if r.headers['content-type'] == 'application/json':
            return r.json()  # parse json responses automatically
        return r

    # form the search_query string by hand for once
    search_query = "?mass_dm__gt=" + str(mass_dm_min) + "&mass_dm__lt=" + str(mass_dm_max)

    # form the url and make the request
    url = "" + simRun + "/snapshots/z=" + str(redshift) + "/subhalos/" + search_query
    subhalos = get(url, {'limit': limNumMassCriteria})

    # return ids of halos falling in query criteria

    ids = [subhalos['results'][i]['id'] for i in range(40000)]
    # ids = [subhalos['results'][i]['id'] for i in range(subhalos['count'])]

    subs = get(snap['subhalos'], {'limit': limNum, 'order_by': 'id'})

    return subhalos, ids, subs

It looks like it worked this time, but now it's checking which of those subhalos are primary subhalos, which is taking a long time. I suspect that's because it found many, since my limit is 40,000?

Dylan Nelson
  • 9 Apr


To avoid ever producing an error, you should only loop over the actual number of responses.

So ids = [subhalos['results'][i]['id'] for i in range(len(subhalos['results']))].

I would suggest limit=5000 or so, so that each return is faster. Then, because the results are paginated, you need to walk through them.

subhalos = {'next': query_url}

while subhalos['next'] is not None:  # or maybe, while subhalos['next'] != "":
    subhalos = get(subhalos['next'])
    for i in range(len(subhalos['results'])):
        pass  # do something with subhalos['results'][i]
    # note: here subhalos['next'] is a new URL pointing to the next page, please see the API documentation
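As a self-contained sketch of the same loop, the version below collects every id across pages. It assumes each response is a dict with 'results' and 'next' keys, as above. The names collect_ids, fetch, and fake_pages are hypothetical and exist only for this example; the fake three-page "server" stands in for real requests so the snippet runs offline.

```python
def collect_ids(first_url, fetch):
    """Follow 'next' links page by page and gather every subhalo id.

    fetch(url) is assumed to return a dict with 'results' and 'next' keys.
    """
    ids = []
    url = first_url
    while url:  # stops when 'next' is None or an empty string
        page = fetch(url)
        ids.extend(sub["id"] for sub in page["results"])
        url = page.get("next")
    return ids


# Fake three-page server, keyed by made-up "URLs", for demonstration only.
fake_pages = {
    "page1": {"results": [{"id": 0}, {"id": 1}], "next": "page2"},
    "page2": {"results": [{"id": 2}, {"id": 3}], "next": "page3"},
    "page3": {"results": [{"id": 4}], "next": None},
}

all_ids = collect_ids("page1", fake_pages.__getitem__)
```

With the real API, fetch would be the get helper from the earlier post, and first_url the search URL with a moderate limit such as the 5000 suggested above.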

Sophia Nasr
  • 10 Apr

Okay, thank you very much!
