Galaxy Zoo Starburst Talk

Duplicates - summary

  • JeanTate by JeanTate

    In this thread I will collect all the Duplicate objects, posted separately in the Duplicates thread; 43 in total.

    I will collect them into three groups:

    • QS-QS duplicates (1)
    • QS-QC duplicates (29)
    • QC-QC duplicates (13)

    QS: Quench Sample

    QC: Quench Control

    Duplicate: same (physical) object, but two (or more) SDSS ObjIds. The objects are galaxies, in most cases.

    Because this version of Talk does not permit the easy identification of the URL of an individual post (within a thread), it is not possible (in most cases) to provide a direct link to posts. The AGS Ids, however, are automatically translated into URLs, so clicking on one will bring up the corresponding Examine page (which has a link to the SDSS DR9 Explore page for the object, and to a NED search result).


  • JeanTate by JeanTate


    AGS00000j6 AGS0000080


  • JeanTate by JeanTate


    (QS first, in all cases)

    AGS00000hz AGS00002i4

    AGS00002aw AGS00003xd

    AGS00000yl AGS00004ho

    AGS00001k4 AGS00004gs

    AGS00001hg AGS00002w0

    AGS00001k6 AGS00003ll

    AGS00001je AGS00004c3

    AGS000020x AGS00003tu

    AGS0000215 AGS00003wr

    AGS00000kg AGS00003tr

    AGS0000294 AGS00003xw

    AGS000025r AGS00003h9

    AGS000020j AGS00002p1

    AGS00001nc AGS00004c5

    AGS00001v8 AGS00003is

    AGS000026z AGS00003w2

    AGS00001u8 AGS00004cv

    AGS0000236 AGS00002z3

    AGS00001ci AGS00003tg

    AGS00001dp AGS00002xx

    AGS0000233 AGS00003dj

    AGS000027k AGS00002wy

    AGS000019f AGS00003c2

    AGS000012x AGS00002qc

    AGS000010e AGS00003b8

    AGS00001sa AGS00002v5

    AGS00001qr AGS0000449

    AGS000013g AGS00003e3

    AGS000009s AGS00004mc


  • JeanTate by JeanTate


    AGS00002q4 AGS00003na

    AGS00002kt AGS000034b

    AGS00003ep AGS00002x2

    AGS00003fg AGS000032d

    AGS00003ts AGS00003cf

    AGS00003pn AGS000040l

    AGS00002j1 AGS00003fb

    AGS000046u AGS00003sp

    AGS00002mx AGS00002sy

    AGS00002bi AGS000039z

    AGS00002l2 AGS00003oa

    AGS00002og AGS00002v3

    AGS00004cr AGS00004lx


  • meeka777 by meeka777

    Did you find these by looking through the objects individually or was there something else in the data that made them stand out more easily?


  • JeanTate by JeanTate in response to meeka777's comment.

    I downloaded the QS and QC catalogs, and combined them. I then found pairs of objects that are separated by < 1.1' (angular distance), and checked that each such pair is, in fact, the same galaxy (and not two separate galaxies).

    However, after I compiled this thread, I made a disturbing discovery, which I have written up in a separate thread: Oh dear!
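
A rough sketch of the pair search described above (combine the two catalogs, then flag every pair closer than a threshold) might look like the following. The rows, the ID labels, and the helper names `angular_separation_arcsec` and `find_close_pairs` are hypothetical and only for illustration; the threshold is treated as arcseconds here, which is an assumption about the units of the 1.1' quoted in the post. Each flagged pair would still need checking by eye, as the post says.

```python
import math
from itertools import combinations

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds; inputs are RA/Dec in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, which stays accurate for very small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0

def find_close_pairs(catalog, max_sep_arcsec):
    """Return (id_a, id_b, separation) for every pair closer than the threshold."""
    pairs = []
    # Brute-force comparison of all pairs; fine for a few thousand rows.
    for (id_a, ra_a, dec_a), (id_b, ra_b, dec_b) in combinations(catalog, 2):
        sep = angular_separation_arcsec(ra_a, dec_a, ra_b, dec_b)
        if sep < max_sep_arcsec:
            pairs.append((id_a, id_b, sep))
    return pairs

# Hypothetical combined QS + QC rows: (id, ra_deg, dec_deg). Not real catalog data.
combined = [
    ("QS-a", 150.00000, 2.00000),
    ("QC-b", 150.00010, 2.00005),
    ("QC-c", 151.50000, 2.70000),
]

# Threshold assumed to be in arcseconds (see the note above).
close_pairs = find_close_pairs(combined, max_sep_arcsec=1.1)
for id_a, id_b, sep in close_pairs:
    print(f"{id_a} {id_b} {sep:.2f} arcsec")
```

If the threshold was meant as arcminutes, multiply it by 60 before passing it in.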


  • mlpeck by mlpeck in response to JeanTate's comment.

    The QS-QS duplicate is at least slightly interesting. The spectra were recorded on two different plates with slightly different coordinates for the fibers, so two different regions of the same galaxy were sampled. The spectra are similar except that one is somewhat redder than the other. Only one of them is tagged as a "science primary". Here are the DR8 spectrum plots. The second one is the designated primary.

    [image: DR8 spectrum plot, first spectrum]

    [image: DR8 spectrum plot, second spectrum (designated primary)]


  • JeanTate by JeanTate in response to mlpeck's comment.


    I'd already written this up, in the Outliers thread (page 4; "Posted August 8 2013 11:23 AM"). But as there's no way to provide a direct link to that post - the thread has rather too many pages to easily scroll through (though Search does identify that thread as containing this object) - the only thing I can do is copy/paste:

    There's just one QS-QS duplicate, and it illustrates how some of the outliers came to be in the dataset. The duplicate pair is AGS00000j6 (DR7 ObjId 587731514231619685), and AGS0000080 (587731514231619686). Here are the DR7 images, in the same order:

    [images: DR7 images of AGS00000j6 and AGS0000080]

    There are two DR7 spectra, as there must be; all QS (and QC) objects have SDSS spectra, and only DR7 spectra were used in the selection process (i.e. no BOSS spectra). The SpecIds are 117034639469576192 and 117034935805542400; here they are (in the same order):

    [images: the two DR7 spectra, in the same order]

    To the eye they look similar, no?

    The line data, from our QS database, are as follows (..j6 first); no errors on any except d4000 (see here for more on why):

    • d4000: 1.427±0.043 1.427±0.046
    • Halpha: 454 525
    • Hbeta: 41.9 57.1
    • NaD: 4.22 3.44
    • [NII]: 256 226
    • [OII]: 70.5 25.8
    • [OIII]: 72.3 24.9

    Other parameters:

    • redshift: 0.047840±0.000007 0.04742±0.00001
    • V_disp: 118±15 173±15
    • Log_mass: 11.04 10.78
    • Petro_R50: 0.75 4.98
    • (u, g, r, i, z): (24.7, 23.3, 19.6, 18.4, 17.5) (18.7, 16.8, 15.8, 15.4, 14.7)

    As I understand it, the line data is observational; other than a simple model for the continuum, it's just a straightforward 'count the pixel values' thing. Ditto V_disp.

    Redshift is a bit more complicated, but it too is observational; in any case redshifts are hard to get wrong if there's a good spectrum (i.e. one with a high signal/noise ratio) [1].

    Log(mass) is derived from the spectrum, but it is heavily, heavily influenced by the (kind of) model used (see the GZ forum thread Estimating the stellar mass of an SDSS galaxy from its colors - how? for more details).

    Petro_R50 and the five 'model mags' (one for each band), however, come from the photometric pipeline; in particular, that pipeline includes a 'de-blender', which attempts - automatically - to unscramble overlapping objects, and estimate things like the magnitudes (in each band) and size (Petro_R50, in our case) for each separately. In this case, the de-blender really messed up, big time! Photometrically, AGS0000080 (587731514231619686) is a galaxy; AGS00000j6 (587731514231619685) is not.

    [1] Exception: overlaps, e.g. foreground star and background galaxy, or overlapping galaxies
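
To put numbers on how far apart the two records are, here is a small sketch that compares the values quoted in the post above. The numbers are copied from the post; the dictionary layout and variable names are only for illustration, and no uncertainties are propagated (the post notes that only d4000 has an error).

```python
# Values copied from the post above: AGS00000j6 first, AGS0000080 second.
line_data = {
    "Halpha": (454, 525),
    "Hbeta": (41.9, 57.1),
    "NaD": (4.22, 3.44),
    "[NII]": (256, 226),
    "[OII]": (70.5, 25.8),
    "[OIII]": (72.3, 24.9),
}
model_mags = {
    "AGS00000j6": {"u": 24.7, "g": 23.3, "r": 19.6, "i": 18.4, "z": 17.5},
    "AGS0000080": {"u": 18.7, "g": 16.8, "r": 15.8, "i": 15.4, "z": 14.7},
}

# Ratio of the first record's line value to the second's.
line_ratios = {name: first / second for name, (first, second) in line_data.items()}

# Magnitude offset per band (positive means ..j6 is listed as fainter).
mag_offsets = {
    band: model_mags["AGS00000j6"][band] - model_mags["AGS0000080"][band]
    for band in "ugriz"
}

for name, value in line_ratios.items():
    print(f"{name:7s} ratio  {value:.2f}")
for band, offset in mag_offsets.items():
    print(f"{band}       offset {offset:+.1f} mag")
```

The spectroscopic values differ by factors of up to roughly three, while the model magnitudes differ by about three to six and a half magnitudes in every band, consistent with the photometric problem described in the post.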


  • mlpeck by mlpeck

    So, which to keep? My inclination would be to keep the one designated as the "science primary," but that's the one with the bad photometry, at least in the DR7 database. It seems to have been corrected by DR9:


  • trouille by trouille scientist, moderator, admin

    Thank you both for these posts. This has been incredibly useful to see that there are duplicates. Check out the tomorrow and after 10am CST tomorrow (Friday) as well. Because of what you've found, I identified good replacement galaxies for all the duplicated control galaxies. That way each post-quenched galaxy will have its own unique control galaxy, which will make the analysis more straightforward.

    Whoohoo for Quenchers!! This is how a science team works. People double-check each other's work and make sure we're headed in the right direction. Wonderful to see happening here!


  • mlpeck by mlpeck

    Does this mean there will soon be updated data tables as well?

    I'd encourage you to add the Lick index H-delta A to the data tables. It's tabulated as lick_hd_a with a corresponding error estimate in the galSpecIndx table (btw I'm using the DR9+ database -- I think some tables had different names in DR7). Here is a hint at why I think it's interesting.

    A more radical step would be to consider using DR10 data. There are outputs from a number of stellar population models there, some of which are considerably more detailed than what's tabulated in galSpecExtra (not necessarily more accurate of course).


  • Peter_Dzwig by Peter_Dzwig

    Hi all. Here is a quick observation of Qs and Qc colour magnitude (or color, if you must 😃 ). I have been looking at the comparison between the Qs and Qc colour spreads. Plotting (for example) Qs by colour gives us a spread of one colour relative to another in the dataset. Fixing on one colour (say u) and then stepping through the other colours gives the magnitude of u relative to the others. This provides a distribution of intensities of, say, u vs (g, r, i, z). We can then compare it with the corresponding plot in Qc.

    There is one overall comment: that Qs and Qc have differing distributions, except for the (i,r), (i,z) pairs.

    More broadly

    1. The Qs data is much more tightly grouped than the Qc, which tends to have a broader spread
    2. The slope is different for Qs (steeper) than Qc (a curve fit would not yield the same slopes for what are essentially distributions about a straight line for both cases).
    3. The Qc dataset has much more data at higher (i.e. dimmer) magnitudes than Qs
    4. The magnitudes of Qs are concentrated into a narrower range than Qc, although the lower ends are similar.

    Because I can't pick out individual objects in Tools, this is more generic than specific. Any thoughts, anyone?
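
The slope and spread comparison in points 1 and 2 above could be quantified with a straight-line fit to each sample, along these lines. This is only a sketch: the magnitudes below are made-up placeholder numbers, not Qs or Qc data, and `fit_line` is a hypothetical helper doing an ordinary least-squares fit.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, scatter), where scatter is the RMS of the
    residuals about the fitted line (with n - 2 degrees of freedom).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]
    scatter = math.sqrt(sum(r * r for r in residuals) / (n - 2))
    return slope, intercept, scatter

# Placeholder magnitudes for illustration only (NOT real Qs/Qc values).
g_mags = [16.0, 16.5, 17.0, 17.5, 18.0]
u_sample_a = [17.6, 18.2, 18.8, 19.4, 20.0]   # tight, steeper relation
u_sample_b = [17.3, 17.6, 18.4, 18.5, 19.2]   # looser, shallower relation

slope_a, intercept_a, scatter_a = fit_line(g_mags, u_sample_a)
slope_b, intercept_b, scatter_b = fit_line(g_mags, u_sample_b)

print(f"sample A: slope {slope_a:.2f}, scatter {scatter_a:.3f}")
print(f"sample B: slope {slope_b:.2f}, scatter {scatter_b:.3f}")
```

With the real catalogs, the two slopes and the two scatter values would give a direct check on whether Qs is steeper and more tightly grouped than Qc.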


  • jules by jules moderator

    Hi Peter - nice work! I found a similar pattern, though you explain things in much more detail than I did here.

    I remember other threads discussing colour (!) too - probably time to get organised and collect them in a list so I'll get on with that later this week.


  • Peter_Dzwig by Peter_Dzwig

    So this is an example: u vs g. Quench-1 is Qs; Quench-2 is Qc.


  • zutopian by zutopian in response to JeanTate's comment.

    Why does the QS sample still contain this duplicate pair? One ID must be removed.


  • zutopian by zutopian in response to zutopian's comment.

    Which are the counterpart galaxies in the QC catalog for this duplicate QS pair?
    I guess that there are 2 different QC counterpart galaxies for this duplicate QS pair. So one of the QC counterpart galaxies must be removed, after removing one ID of the QS duplicate galaxy pair.


  • JeanTate by JeanTate in response to zutopian's comment.

    AGS0000080 has a log_mass of 10.7796 and a redshift of 0.0474274; AGS00000j6's values are 11.0397, and 0.0478403, respectively. As you say, there will be QC counterparts to both these. The two log_mass values are not really all that close, so choosing one of the QS objects to exclude also means slightly changing the distribution of stellar masses (but, as it's just one object, that change will be very small indeed).


  • zutopian by zutopian

    AGS00002q4 AGS00003na

    I missed this discussion when the problem was outstanding; at that time I wasn't using GZQ Talk. I know that the problem is solved, but I don't know the source of the problem and couldn't find that information. I have the following puzzle:
    The GZQ images are identical, but why do they have different SDSS-IDs (DR7)?
    They are different objects (different ra/dec values) and, curiously, also have different redshifts when one checks the displayed IDs in DR7.
    The displayed image doesn't match the displayed SDSS ID and displayed ra/dec values. I wonder why it is messed up?
    So actually it doesn't match the definition: "Duplicate: same (physical) object, but two (or more) SDSS ObjIds."
    (BTW, it also doesn't match for the QC-QS cases, because the duplicate pairs have identical ObjIDs.)

    PS: I edited my post.


  • zutopian by zutopian

    ◦ QS-QC duplicates (29)
    ◦ QC-QC duplicates (13)

    In another topic, the following is given:

    1. The classification results for the 57 control galaxies that needed replacements have been uploaded into Quench Tools.

    I wonder why there is a difference?
    It seems that for the QC-QC duplicates all 26 (2x13) IDs were removed. I wonder why? Actually, I guessed that just 13 had been removed. What is the reason that a difference of 2 remains?
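
The arithmetic behind this question can be laid out explicitly. The counts are the ones quoted in this thread; the assumption that each QS-QC pair accounts for exactly one replaced control galaxy is the post's own reading, not something confirmed by the science team.

```python
# Counts quoted in the thread.
qs_qc_pairs = 29            # QS-QC duplicate pairs
qc_qc_pairs = 13            # QC-QC duplicate pairs
replacements_reported = 57  # control galaxies reported as needing replacement

# Assumption: each QS-QC pair accounts for one replaced control galaxy.
# Scenario 1: one member of each QC-QC pair is replaced.
one_member_per_pair = qs_qc_pairs + qc_qc_pairs

# Scenario 2: both members of each QC-QC pair are replaced.
both_members_per_pair = qs_qc_pairs + 2 * qc_qc_pairs

# Replacements left unexplained under scenario 2.
unexplained = replacements_reported - both_members_per_pair

print(one_member_per_pair, both_members_per_pair, unexplained)
```

Under the first scenario the total is 42, under the second it is 55, and in the second case 2 of the 57 reported replacements are still unaccounted for, which is the difference asked about above.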