Characterizing 'the 778': how do the QS objects differ from the matched/paired QC ones?

by JeanTate

Following the recommendation in Laura's post (OP in the Dealing with Sample Selection Issues thread), I've begun to characterize these 778 pairs. At the highest level - first question in the classification decision tree - here is the distribution:
```
   Smoo  FoD  SoA
QS  527  250   1
QC  509  267   2
```
"Smoo": "Smooth"; "FoD": "Features or Disk"; "SoA": "Star or Artifact". Yes, these look similar, and a chi-square(d) test (ignoring the SoA) finds no statistically significant difference: chi-square(d) = 0.905, 1 dof, p = 0.341

Within FoD and within Smoo, however, the two samples do differ ("rfod" = !Eos, i.e. "No" to the question "Could this be a disk viewed edge-on?"; "rsmoo" = "Completely round" plus "In between"; "cig" = "Cigar-shaped"):
```
    Eos rfod
QS  107  143
QC   74  193
```
Chi-square(d) is 12.9, 1 dof, p < 0.001
```
    cig rsmoo
QS  126  401
QC   68  441
```
Chi-square(d) is 18.9, 1 dof, p < 0.001
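
For anyone who wants to check 2x2 comparisons like the two above, here is a minimal sketch. It assumes a plain Pearson chi-square without continuity correction (the post does not say which variant was used), and `chi2_2x2` is a hypothetical helper name, not something from the original analysis. For 1 dof the p-value can be obtained from the complementary error function, so only the standard library is needed.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    # Shortcut formula for a 2x2 table: n * (ad - bc)^2 / (product of the margins)
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 dof: P(X > x) = erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Counts taken from the two tables above (rows: QS, QC)
eos_stat, eos_p = chi2_2x2(107, 143, 74, 193)   # Eos vs rfod
cig_stat, cig_p = chi2_2x2(126, 401, 68, 441)   # cig vs rsmoo

print(f"Eos vs rfod:  chi2 = {eos_stat:.1f}, p = {eos_p:.2g}")
print(f"cig vs rsmoo: chi2 = {cig_stat:.1f}, p = {cig_p:.2g}")
```

A continuity-corrected (Yates) version would give somewhat smaller statistics, so the choice of variant matters when a result is close to the significance threshold.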

Relative to the control sample, the QS objects contain more edge-on spirals (per zooites' classifications) and more cigar-shaped ones (which are - very likely - predominantly a mix of unrecognized Eos and highly-inclined spirals/disk galaxies, possibly with small or no bulges).

What about the distribution of 'BPT types', the relative proportions that are - per their position in a generalized BPT diagram - 'AGN', 'Composite', 'Star-forming', 'Weak AGN', 'Weak SF', 'Unclassifiable'? Rather different than you may have thought (next post)!

Posted December 8, 2013 10:43 PM
by JeanTate in response to JeanTate's comment.

Comparing BPT type = 'AGN/Composite/SF/weakAGN/weakSF' ("BPT") with 'unclassifiable' ("!BPT"), there's no statistically significant difference between QS and the control sample:
```
   BPT !BPT
QS 738   40
QC 732   46
```
Chi-square(d) = 0.443, 1 dof, p=0.506

Similarly, the difference between QS and QC in the distribution of 'AGN' vs 'Composite' is not statistically significant:
```
   AGN Composite
QS 124       144
QC  55        89
```
Chi-square(d) = 2.49, 1 dof, p=0.115

But that's it ... in every other (sensible) comparison I've looked at, the difference is statistically significant.

For example, here's "AGN-like" ('AGN' plus 'Composite' plus 'weak AGN') cf "SF-like" ('SF' plus 'weak SF'):
```
   AGN-like SF-like
QS  303      435
QC  210      522
```
Chi-square(d) = 10.8, 1 dof, p=0.001

The right-hand column ("SF-like") hides an astonishing difference, between "SF" and "weak SF":
```
   SF  weakSF
QS 432    3
QC 357  165
```
Chi-square(d) = 157 ( 😮) , 1 dof, p < 0.001

Some of the finer differences may be new, but the main one is not. In several other posts/threads, it has already been noted that QS objects are significantly more likely than the controls to be "pure BPT" ('pBPT', i.e. 'AGN', 'Composite', or 'SF') rather than "others" (i.e. with S/N < 3 for at least one of the four emission lines), albeit not specifically for 'the 778'. Here are the numbers:
```
   pBPT others
QS 700    78
QC 501   277
```
Chi-square(d) = 145, 1 dof, p < 0.001
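
To make the grouping explicit, here is a small sketch of how the 'pBPT' vs 'others' table can be rebuilt from per-type counts. The 'AGN', 'Composite', 'SF', 'weakSF', and unclassifiable counts are read off the tables in this post; the 'weakAGN' counts are not quoted directly and are inferred here by subtraction (AGN-like minus 'AGN' minus 'Composite'). The names `counts`, `GROUPS`, and `collapse` are illustrative only.

```python
# Per-type counts for 'the 778'. weakAGN is inferred, not quoted:
# QS: 303 - 124 - 144 = 35; QC: 210 - 55 - 89 = 66.
counts = {
    "QS": {"AGN": 124, "Composite": 144, "SF": 432,
           "weakAGN": 35, "weakSF": 3, "unclassifiable": 40},
    "QC": {"AGN": 55, "Composite": 89, "SF": 357,
           "weakAGN": 66, "weakSF": 165, "unclassifiable": 46},
}

# Which BPT types are pooled into each column of the table
GROUPS = {
    "pBPT": ("AGN", "Composite", "SF"),
    "others": ("weakAGN", "weakSF", "unclassifiable"),
}

def collapse(sample, group):
    """Sum the counts of all BPT types belonging to one group."""
    return sum(counts[sample][t] for t in GROUPS[group])

# One row per sample, one column per group, in the order given in GROUPS
table = {s: tuple(collapse(s, g) for g in GROUPS) for s in counts}

for sample, row in table.items():
    print(sample, row, "total:", sum(row))
```

Each row sums to 778, which is a quick consistency check that every object has been assigned to exactly one group.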

Posted December 9, 2013 12:51 AM