coord_ra and coord_dec values are all nan's in icSrc catalogs #425

jchiang87 · 2016-12-08T18:18:36Z

In the process of looking into comparing the inferred input magnitudes from the instance catalogs to the fluxes measured by processEimage.py, I delved into one of the icSrc catalogs:

[cori05] pwd -P
/global/project/projectdirs/lsst/phosim_deep/feasibility_study/single_raft/output_repo/icSrc/v1414156-fr/R22

In [2]: import astropy.io.fits as fits

In [3]: foo = fits.open('S11.fits')

In [4]: foo[1].data.field('coord_ra')
Out[4]: 
array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan])

In [5]: foo[1].data.field('coord_dec')
Out[5]: 
array([ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,
        nan,  nan])

In [6]:
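
A more compact check than printing the full arrays is to compute the fraction of NaN entries in a column. The following is a minimal sketch using only the standard library; the helper name `nan_fraction` is hypothetical, and with astropy the input would be something like `foo[1].data.field('coord_ra')`.

```python
import math


def nan_fraction(values):
    """Return the fraction of entries in `values` that are NaN."""
    values = [float(v) for v in values]
    if not values:
        # An empty column has no NaN entries by convention here.
        return 0.0
    return sum(math.isnan(v) for v in values) / len(values)


# Stand-in data; with astropy this would be a catalog column such as
# foo[1].data.field('coord_ra') (assumption: the column is numeric).
column = [float('nan'), float('nan'), float('nan')]
print(nan_fraction(column))  # 1.0 means every entry is NaN
```

A result of 1.0 for both `coord_ra` and `coord_dec` would correspond to the all-NaN output shown above.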

This is using a phosim deep precursor, but I see the same thing for Run1.1 and Run3 datasets at SLAC. The recent Run3 Level 2 results seem to be here:

/global/cscratch1/sd/desc/twinkles/work/5/output

but I can't read those files since I'm not in the desc group.

What do we need to do to get the Stack to produce useful coordinates in these catalogs?

SimonKrughoff · 2017-01-05T19:18:51Z

We should be reading in the catalogs with the Stack persistence mechanisms, not astropy.io.fits. I don't know if it would fix this, but there is more to the persistence framework than just opening the FITS file. In this case you would do either (both untested):

>>> import lsst.afw.table as afwTable
>>> src_cat = afwTable.SourceCatalog.readFits('S11.fits')

or better yet

>>> import lsst.daf.persistence as dafPersist
>>> butler = dafPersist.Butler('/global/project/projectdirs/lsst/phosim_deep/feasibility_study/single_raft/output_repo/')
>>> icSrc = butler.get('icSrc', dataId={'visit':1414156, 'filter':'r', 'raft':'2,2', 'sensor':'1,1'})

jchiang87 · 2017-01-05T19:41:18Z

thanks for the untested code. is there a way of making the butler serve up data faster?

SimonKrughoff · 2017-01-05T19:51:29Z

I guess I'm not sure what you mean. I assume it is I/O bound.

jchiang87 · 2017-01-05T19:54:45Z

the butler's inherent slowness was my motivation for using astropy in this case. in other cases, it just falls down with complaints like "OperationalError: no such column: tract"

SimonKrughoff · 2017-01-05T20:02:29Z

If you try both the methods mentioned above, does one or the other perform faster?

I'd need to see the specific cases where it gives you that error, to help debug.

SimonKrughoff · 2017-01-05T20:44:19Z

O.K. I am seeing the same thing you did: i.e. coords in icSrc are nan. I think that's actually expected because the icSrc catalog is produced before the astrometric calibration. If you access the src dataset instead, you'll see that the Coord objects are populated.

Regarding the speed, I don't notice it being particularly slow. Is it possible that when you say the butler loading is slow that you are making a butler instance for every dataset you access? Instantiation of the Butler object is far slower than it should be, but you should only have to do that once.
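
As a rough illustration of the two points above (use `src` rather than `icSrc`, and pay the Butler start-up cost only once), here is a minimal sketch. The helper `load_catalogs` is hypothetical, the extra sensor names in the usage comment are made up for illustration, and it is an assumption that `src` accepts the same dataId keys as the `icSrc` example earlier in the thread.

```python
def load_catalogs(butler, data_ids, dataset_type='src'):
    """Fetch one catalog per dataId, reusing a single butler instance.

    `butler` is any object with a get(datasetType, dataId=...) method,
    as in the butler.get(...) call shown earlier in this thread.
    """
    return [butler.get(dataset_type, dataId=data_id) for data_id in data_ids]


# Hypothetical usage inside a Stack environment (untested):
#
#   import lsst.daf.persistence as dafPersist
#   butler = dafPersist.Butler('<path to output_repo>')  # create once
#   data_ids = [dict(visit=1414156, filter='r', raft='2,2', sensor=s)
#               for s in ('1,0', '1,1', '1,2')]  # sensor list is illustrative
#   catalogs = load_catalogs(butler, data_ids)
```

This does not remove the start-up cost for a script launched repeatedly from the command line; it only avoids paying it more than once per process.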

jchiang87 · 2017-01-05T20:48:35Z

I am only doing it once in a given script, but I often have scripts that use the butler and are run from the bash command line, so that startup time is always hitting me (in addition to importing the Stack modules) and slows things down a lot...difficult to deal with when debugging stuff.

SimonKrughoff · 2017-01-05T20:56:17Z

So can you use the src dataset instead of the icSrc dataset? I.e. does that solve your problem?

I would really like to use the stack persistence framework if we can. I will help do that if I can.

jchiang87 · 2017-01-05T21:14:55Z

I think I make a good faith effort to use it first and as much as possible in production code (i.e., not necessarily in tossed off examples that help make it clear where I think a problem lies, e.g., this issue), but if I hit a roadblock, like that tract thing, I'd really rather get my own work done than chase my tail trying to get it to work (which I have spent many keystrokes in the past doing.) The main issue I have with the butler is that it doesn't offer any obvious level of introspection into what sort of data are available, hence the need to post headers of catalog fits files to know what the column names are.

SimonKrughoff · 2017-01-05T23:52:18Z

O.K. Two things.

I appreciate your trying to use the stack persistence mechanisms and if you are chasing your tail we should genuinely fix those things. The problem with using ad hoc solutions is that other people will cargo cult them and we'll be in real trouble if the underlying persistence mechanisms change (which they will). It may not be as efficient in the short run, but it might be worth it in the long run to spend a little extra time trying to get DM to fix the things that make it unusable for you. Now that we are on slack, you can @-mention me or Nate Pease at any point to get butler help. Nate is very responsive (and is at SLAC).

> The main issue I have with the butler is that it doesn't offer any obvious level of introspection into what sort of data are available, hence the need to post headers of catalog fits files to know what the column names are.

I understand this complaint in general, but in this case it isn't quite fair. If you had read the data into an afwTable.SourceCatalog object, there are mechanisms for introspecting the schema of the catalog. This is not really a butler problem.
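
A minimal sketch of that kind of schema introspection follows. The helper `list_columns` is hypothetical; it assumes the catalog exposes `getSchema()` and that the schema exposes `getNames()`, which is believed to match `lsst.afw.table` but has not been tested against a particular Stack version.

```python
def list_columns(catalog):
    """Return the sorted field names defined in a catalog's schema.

    Assumes `catalog.getSchema().getNames()` returns an iterable of
    strings, as an afwTable.SourceCatalog is expected to provide.
    """
    return sorted(catalog.getSchema().getNames())


# Hypothetical usage inside a Stack environment (untested):
#
#   import lsst.afw.table as afwTable
#   src_cat = afwTable.SourceCatalog.readFits('S11.fits')
#   for name in list_columns(src_cat):
#       print(name)
```

This gives the column names without dumping FITS headers by hand.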

SimonKrughoff · 2017-01-06T15:47:09Z

@jchiang87 this being closed makes me think that using the src dataset fixes your problem. Is that accurate?

jchiang87 · 2017-01-06T16:24:42Z

> Is that accurate?

Yes.

jchiang87 closed this as completed Jan 6, 2017
