Galaxy Zoo Starburst Talk

Data update in Tools

  • KWillett by KWillett scientist

    Hi everyone,

    Many of you may have noticed some changes in the last few days to the morphology data that you can access in Tools. Below is a description of what we've modified, why we've done this, and how the changes might affect our collective analysis. Please post here if you have any questions, and we'll respond as soon as we can. Thanks again for all your continuing help!

    (content copied from the Galaxy Zoo blog)

    Since finishing the classifications for the GZ: Quench project, many of our volunteers have been analyzing that consensus data using the tools at tools.galaxyzoo.org. We made a few changes to the site earlier this week, and I’d like to describe them and talk about how they might affect your work on the project.

    First, a quick reminder of how the data is presented. As most of you probably remember, the classification process on GZ: Quench (and all GZ projects since GZ2) is what we call a “decision tree”. We begin with a broad question on morphology (i.e., “Is this galaxy smooth, or does it have features or a disk?”) for the volunteer to answer. We then ask more specific follow-up questions that depend on the previous answers. For example, if you said the galaxy doesn’t have any spiral arms, it doesn’t make sense for us to then ask you how many arms there are, because that question doesn’t apply to this galaxy! So, out of 11 potential questions covering galaxy morphology, a single classifier answers only a subset (between 4 and 9) of them. Here’s a flowchart of the decision tree for GZ: Quench; it’s an interesting exercise to look at it and work out how many unique morphologies you could sort galaxies into by going through the tree.

    GZ Quench decision tree
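
    To make the decision-tree idea concrete, here is a minimal sketch in Python. The tree below is a made-up toy with four questions, not the real 11-question GZ: Quench tree, and the question names and answer labels are assumptions chosen for illustration. It shows how the questions a classifier sees depend on earlier answers, and how one could count the distinct paths through a tree.

```python
# Toy decision tree (hypothetical, far smaller than the real GZ: Quench tree).
# Each question maps every possible answer to the next question,
# or to None when that answer ends the classification.
TREE = {
    "smooth_or_features": {"Smooth": "roundness", "Features": "spiral"},
    "roundness": {"Round": None, "In between": None, "Cigar": None},
    "spiral": {"Spiral": "arm_count", "No spiral": None},
    "arm_count": {"1": None, "2": None, "3+": None},
}


def questions_asked(answers, start="smooth_or_features"):
    """List the questions one classifier is shown, given their answers."""
    asked = []
    question = start
    while question is not None:
        asked.append(question)
        question = TREE[question][answers[question]]
    return asked


def count_paths(question="smooth_or_features"):
    """Count the distinct end-to-end paths (morphologies) through the tree."""
    if question is None:
        return 1
    return sum(count_paths(nxt) for nxt in TREE[question].values())


# A classifier who sees features but no spiral is never asked about arms.
example = questions_asked({"smooth_or_features": "Features", "spiral": "No spiral"})
total_morphologies = count_paths()
```

    The same counting approach would apply to the real flowchart once each question and its answers are written out in this form.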

    So, why this discussion? When we added the data to the Tools website, we added a label in each category that gave the most common response to that question. For example, under “Arm tightness”, you could see that all galaxies were either “Tight”, “Medium”, or “Loose”. However, this is problematic when you’re trying to analyze data and compare different sets of galaxies. For smooth (or elliptical) galaxies, the arm-tightness label is the result of very few votes (or even none). Such labels don’t represent the majority of classifications, and thus we really shouldn’t be including those galaxies when trying to compare what makes a medium-wound vs. a loosely-wound spiral.

    The solution we’ve adopted has been to edit the data on Tools: questions whose answers don’t apply to the consensus morphology (e.g., spiral arms in a smooth galaxy, or the roundness of a spiral) are now blank. This means that if you look at the average color or size of galaxies grouped by any of these morphology properties, you’re now truly comparing similar groups of objects (apples to apples). Earlier samples included galaxies to which the question did not apply, which likely introduced a significant amount of bias; the science team thinks that this change will largely address that.
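
    As a rough illustration of the filtering described above, here is a short Python sketch. It is not the code used for Tools; the field names, answer labels, and color values are invented for the example. It blanks answers to questions that do not apply to a galaxy's consensus morphology and then averages a property per answer, skipping the blanks.

```python
# Hypothetical schema: each follow-up field is only meaningful when the
# named parent field has the required consensus answer.
APPLIES_WHEN = {
    "arm_tightness": ("spiral", "Spiral"),
    "roundness": ("smooth_or_features", "Smooth"),
}


def blank_inapplicable(record):
    """Return a copy of record with non-applicable answers set to None."""
    cleaned = dict(record)
    for field, (parent, required) in APPLIES_WHEN.items():
        if cleaned.get(parent) != required:
            cleaned[field] = None
    return cleaned


def mean_by_answer(records, field, value_key):
    """Average value_key for each answer to field, ignoring blank answers."""
    totals = {}
    for rec in records:
        answer = rec.get(field)
        if answer is None:
            continue
        running_sum, count = totals.get(answer, (0.0, 0))
        totals[answer] = (running_sum + rec[value_key], count + 1)
    return {answer: s / n for answer, (s, n) in totals.items()}


# Made-up galaxies: the first is smooth but still carries a stray "Tight" label.
galaxies = [
    {"smooth_or_features": "Smooth", "spiral": None,
     "arm_tightness": "Tight", "roundness": "Round", "color": 2.0},
    {"smooth_or_features": "Features", "spiral": "Spiral",
     "arm_tightness": "Tight", "roundness": "Round", "color": 1.0},
    {"smooth_or_features": "Features", "spiral": "Spiral",
     "arm_tightness": "Loose", "roundness": None, "color": 0.5},
]

before = mean_by_answer(galaxies, "arm_tightness", "color")
after = mean_by_answer([blank_inapplicable(g) for g in galaxies],
                       "arm_tightness", "color")
```

    In this toy data the smooth galaxy pulls the "Tight" average away from the true spirals before filtering, which is the kind of bias the blanking is meant to remove.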

    What does this mean for your analysis? Most of your old Dashboards and results should still work and remain valid. For any work where you were analyzing morphological details (especially for spiral structure), though, we encourage you to revisit these and run them again on the new, filtered dataset. Please keep posting any questions you have on Talk, and we’ll answer them as soon as we can. Good luck!

    Posted

  • JeanTate by JeanTate in response to KWillett's comment.

    Thanks Kyle.

    (content copied from the Galaxy Zoo blog)

    For those who don't already know, Kyle is referring to GZ: Quench data update

    "some changes in the last few days to the morphology data that you can access in Tools"

    I can't speak for other zooites, but I have not been able to download the QS or the QC catalogs for over a week now (and several others have said they can't either). When will the download capability be restored?

    Posted

  • jules by jules moderator

    I did mention the download problem to the team but no-one has got back yet. Are you still unable to download, Jean?

    Posted

  • JeanTate by JeanTate in response to jules's comment.

    Still not able to download. 😦

    I have been able to use Tools, which is good, and it seems to be working/loading much more quickly.

    Posted

  • jules by jules moderator

    Tools is working but still slow for me. 😦 And I've just sent a reminder about the download issue.

    Posted

  • mlpeck by mlpeck

    Data download seems to be working now. It just takes a very long time between clicking the button labeled "Download Data" and a prompt to save it.

    Posted

  • jules by jules moderator

    It's not working for me and Tools is even slower. I'll get back to the team!

    Posted

  • JeanTate by JeanTate in response to mlpeck's comment.

    After several (failed) attempts, I managed to replicate this experience.

    Posted

  • trouille by trouille scientist, moderator, admin

    Hi all, please check out this post and this post. If it's still not working or it's still very slow, definitely post in those discussion boards and we'll see if there's another bug we've missed.

    Thank you all for posting. Quench has done an amazing job at helping the developers get Tools to a better place!

    Posted

  • JeanTate by JeanTate in response to KWillett's comment.

    The flow chart is inconsistent with the downloaded data, in at least one respect.

    If the answer to "Could this be a disk viewed edge on?" is "Yes", then the next question - in the decision tree - is "Does the galaxy have a bulge at its center?" According to the flow chart, no matter what the answer to this question is, the next question is "Are there any off-center bright clumps embedded within the galaxy?"

    And that's what happens too, when you classify (I just checked).

    However, in the downloaded data, the field "clumps" is empty for every object, except those for which the "disk_edge" field value is "No". Specifically, in the QC catalog, none of the 627 objects for which the "disk_edge" field value is "Yes" has a (non-null) value in the "clumps" field.
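
    For anyone who wants to repeat this kind of check on a downloaded catalog, here is a minimal Python sketch. The field names "disk_edge" and "clumps" are the ones quoted above; the CSV layout, the "id" column, and the sample rows are assumptions made up for the example, so the real download may need different parsing.

```python
import csv
import io

# Made-up stand-in for a downloaded catalog; an empty cell means a null value.
SAMPLE = """id,disk_edge,clumps
1,Yes,
2,No,Yes
3,No,No
4,Yes,
5,,
"""


def clumps_coverage(rows):
    """For each disk_edge value, count rows and rows with a non-empty clumps."""
    counts = {}
    for row in rows:
        key = row["disk_edge"] or "(blank)"
        total, filled = counts.get(key, (0, 0))
        has_clumps = 1 if row["clumps"].strip() else 0
        counts[key] = (total + 1, filled + has_clumps)
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
coverage = clumps_coverage(rows)

# Report (total, with clumps) for each disk_edge value.
for value, (total, filled) in sorted(coverage.items()):
    print(f"disk_edge={value}: {total} objects, {filled} with a clumps value")
```

    To run it on a real file, one would replace the in-memory sample with an opened catalog file, assuming it is comma-separated with a header row.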

    Posted

  • JeanTate by JeanTate

    Another one: the decision tree diagram has three choices as answers to the question "Is there any sign of a spiral pattern?": "Spiral", "No spiral", and "Can't tell".

    However, there are no objects, in either QS or QC, with "Can't tell".

    Perhaps "Can't tell" is not the consensus morphology - in answer to this question - for any object? As classification is now over, I can't check whether that option is offered or not.

    Posted

  • KWillett by KWillett scientist in response to JeanTate's comment.

    Hi Jean,

    That's an excellent catch, and you're correct. That's due to an error on my part in the consensus classification code I wrote. The tree is indeed accurate, but the code skipped over that last step (because it was embedded in the wrong conditional statement).

    We're prepping a new version of the Quench data today that includes the new classifications from the boost phase. I've fixed this as well - you should see the galaxies with edge-on disks and off-center clumps in Tools shortly.

    Posted

  • KWillett by KWillett scientist in response to JeanTate's comment.

    Hi Jean,

    In this case, the flowchart is slightly incorrect - there ended up not being an option for "Can't tell" spiral structure in the decision tree that we used. I'll try to find the original version of this and replace it on Talk and the blog.

    Posted

  • JeanTate by JeanTate

    Thanks Kyle.

    Posted