Curated list of well-established marker genes for cell type annotation across major tissues and cell types.

---

## Table of Contents

1. PBMC / Blood Immune Cells
2. Brain / Neural Tissue
3. Epithelial Tissues
4. Stromal and Endothelial
5. Stem and Progenitor Cells
6. Marker Interpretation Guidelines
7. Additional Resources

---

## PBMC / Blood Immune Cells

### T Cells

**Pan T cell markers:**
- CD3D, CD3E, CD3G - T cell receptor complex
- CD2 - T cell adhesion

**CD4+ T cells (Helper T cells):**
- CD4 - Helper T cell marker
- IL7R (CD127) - IL-7 receptor
- LDHB - Lactate dehydrogenase B (metabolic gene; not specific to CD4+ T cells)

**CD8+ T cells (Cytotoxic T cells):**
- CD8A, CD8B - Cytotoxic T cell markers
- GZMK, GZMB - Granzymes (cytotoxic molecules)

**Naive T cells:**
- CCR7 - Homing to lymph nodes
- SELL (CD62L) - L-selectin
- LEF1, TCF7 - Transcription factors high in naive T cells

**Memory T cells:**
- IL7R - High in central memory
- S100A4 - Memory marker
- GZMK - Effector memory

**Regulatory T cells (Tregs):**
- FOXP3 - Master regulator
- IL2RA (CD25) - High expression
- CTLA4 - Inhibitory receptor

**Activated T cells:**
- TNFRSF9 (CD137) - Activation marker
- CD69 - Early activation
- IFNG - IFN-gamma production

### B Cells

**Pan B cell markers:**
- CD79A, CD79B - B cell receptor complex
- MS4A1 (CD20) - B cell marker
- CD19 - B cell marker

**Naive B cells:**
- IGHD, IGHM - Surface IgD and IgM
- TCL1A - Naive B cell marker

**Memory B cells:**
- CD27 - Memory marker
- TNFRSF13B - Memory and plasma cell marker

**Plasma cells:**
- MZB1, SSR4 - ER proteins (high in plasma cells)
- JCHAIN - Immunoglobulin J chain
- XBP1 - Plasma cell transcription factor
- IGHA1, IGHG1 - Antibody heavy chains

### Monocytes

**Classical Monocytes (CD14+):**
- CD14 - LPS receptor
- S100A8, S100A9 - Inflammation markers
- FCN1 - Ficolin-1

**Non-classical Monocytes (CD16+):**
- FCGR3A (CD16) - Low-affinity IgG receptor
- CDKN1C - Cell cycle inhibitor
- MS4A7 - Membrane protein

**Intermediate Monocytes:**
- CD14 + FCGR3A - Co-expression of both markers (CD14++ CD16+)

### Dendritic Cells

**Conventional DCs (cDC):**
- FCER1A - High-affinity IgE receptor alpha chain (cDC2)
- CD1C (BDCA-1) - Lipid antigen presentation (cDC2)

**Plasmacytoid DCs (pDC):**
- LILRA4 (CD85g, ILT7) - pDC marker
- CLEC4C (BDCA-2) - pDC marker
- IRF7, IRF8 - Interferon regulatory factors (high in pDCs)

### NK Cells

**Natural Killer cells:**
- GNLY - Granulysin (cytotoxic)
- NKG7 - NK cell granule protein 7 (also expressed in cytotoxic T cells)
- GZMB - Granzyme B
- NCAM1 (CD56) - NK cell marker
- KLRB1 (CD161) - NK receptor

### Other Immune Cells

**Megakaryocytes/Platelets:**
- PPBP (CXCL7) - Pro-platelet basic protein
- PF4 (CXCL4) - Platelet factor 4
- GP9, GP1BB - Platelet glycoproteins

**Mast cells:**
- TPSAB1, TPSB2 - Tryptases
- CPA3 - Carboxypeptidase A3
- KIT (CD117) - Stem cell factor receptor
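
For use in scripts, panels like the ones above can be stored as a named list and queried to see which genes appear under more than one cell type. The following is a minimal sketch in base R, not part of any package: the list name `pbmc_markers` and the helper `cell_types_for_gene` are illustrative, and only a subset of the panels is included.

```r
# Illustrative subset of the PBMC panels above, as a named list
# (names are cell types, values are gene symbols).
pbmc_markers <- list(
  "T cell"             = c("CD3D", "CD3E", "CD3G", "CD2"),
  "CD8+ T cell"        = c("CD8A", "CD8B", "GZMK", "GZMB"),
  "Memory T cell"      = c("IL7R", "S100A4", "GZMK"),
  "B cell"             = c("CD79A", "CD79B", "MS4A1", "CD19"),
  "Classical monocyte" = c("CD14", "S100A8", "S100A9", "FCN1"),
  "NK cell"            = c("GNLY", "NKG7", "GZMB", "NCAM1", "KLRB1")
)

# Hypothetical helper: return every cell type whose panel lists the gene.
cell_types_for_gene <- function(gene, markers) {
  hits <- vapply(markers, function(genes) gene %in% genes, logical(1))
  names(markers)[hits]
}

# Genes listed under more than one cell type are ambiguous on their own.
gene_counts <- table(unlist(pbmc_markers, use.names = FALSE))
shared_genes <- sort(names(gene_counts)[as.vector(gene_counts) > 1])

print(cell_types_for_gene("GZMB", pbmc_markers))
print(shared_genes)
```

In this subset, GZMB is listed for both CD8+ T cells and NK cells, so it cannot separate the two without additional markers such as CD3D or CD8A.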

---

## Brain / Neural Tissue

### Neurons

**Pan-neuronal markers:**
- RBFOX3 (NeuN) - Neuronal nuclei
- SNAP25 - Presynaptic SNARE protein
- SYT1 - Synaptotagmin 1 (synaptic vesicle calcium sensor)

**Excitatory neurons:**
- SLC17A7 (VGLUT1) - Vesicular glutamate transporter
- CAMK2A - Calcium/calmodulin-dependent protein kinase II alpha
- SATB2 - Upper-layer cortical neuron marker

**Inhibitory neurons (GABAergic):**
- GAD1, GAD2 - GABA synthesis
- SLC32A1 (VGAT) - Vesicular GABA transporter
- DLX1, DLX2 - Interneuron markers

**Interneuron subtypes:**
- SST - Somatostatin+ interneurons
- PVALB - Parvalbumin+ interneurons
- VIP - VIP+ interneurons

**Dopaminergic neurons:**
- TH - Tyrosine hydroxylase
- SLC6A3 (DAT) - Dopamine transporter
- DRD2 - Dopamine receptor

**Serotonergic neurons:**
- TPH2 - Tryptophan hydroxylase 2 (neuronal isoform)
- SLC6A4 (SERT) - Serotonin transporter

### Glia

**Astrocytes:**
- AQP4 - Aquaporin 4
- GFAP - Glial fibrillary acidic protein
- SLC1A2 (GLT1) - Glutamate transporter
- SLC1A3 (GLAST) - Glutamate transporter
- ALDH1L1 - Astrocyte marker

**Oligodendrocytes:**
- MBP - Myelin basic protein
- MOG - Myelin oligodendrocyte glycoprotein
- PLP1 - Proteolipid protein
- MAG - Myelin-associated glycoprotein

**Oligodendrocyte Precursor Cells (OPCs):**
- PDGFRA - PDGF receptor alpha
- CSPG4 (NG2) - Chondroitin sulfate proteoglycan
- SOX10 - Oligodendrocyte-lineage transcription factor (also in mature oligodendrocytes)

**Microglia:**
- CX3CR1 - Chemokine receptor
- P2RY12 - Purinergic receptor
- TMEM119 - Transmembrane protein
- AIF1 (IBA1) - Calcium-binding protein

**Ependymal cells:**
- FOXJ1 - Cilia marker
- DNAH11 - Dynein (ciliated cells)

---

## Epithelial Tissues

### Lung Epithelial

**Alveolar Type 1 (AT1):**
- AGER - Receptor for advanced glycation end products
- PDPN - Podoplanin
- CLIC5 - Chloride channel

**Alveolar Type 2 (AT2):**
- SFTPC - Surfactant protein C
- SFTPB - Surfactant protein B
- ABCA3 - ATP-binding cassette transporter

**Ciliated cells:**
- FOXJ1 - Cilia transcription factor
- DNAH5, DNAI2 - Dynein proteins

**Club cells (Clara cells):**
- SCGB1A1 - Secretoglobin
- CYP2F2 - Cytochrome P450 (mouse Cyp2f2; the human ortholog is CYP2F1)

**Goblet cells:**
- MUC5AC, MUC5B - Mucins
- TFF3 - Trefoil factor

### Intestinal Epithelial

**Enterocytes:**
- FABP1 - Fatty acid binding protein
- APOA1 - Apolipoprotein A1
- SI - Sucrase-isomaltase

**Goblet cells:**
- MUC2 - Mucin 2
- TFF3 - Trefoil factor

**Paneth cells:**
- LYZ - Lysozyme
- DEFA5, DEFA6 - Defensins

**Enteroendocrine cells:**
- CHGA - Chromogranin A
- TPH1 - Tryptophan hydroxylase 1 (serotonin synthesis in enterochromaffin cells)

**Tuft cells:**
- DCLK1 - Doublecortin-like kinase
- TRPM5 - Cation channel in taste-like chemosensory signaling

**Stem cells (LGR5+):**
- LGR5 - Leucine-rich repeat-containing G protein-coupled receptor 5
- OLFM4 - Olfactomedin 4

---

## Stromal and Endothelial

### Fibroblasts

**Pan-fibroblast markers:**
- COL1A1, COL1A2 - Collagen I
- COL3A1 - Collagen III
- DCN - Decorin
- LUM - Lumican

**Myofibroblasts:**
- ACTA2 (αSMA) - Alpha smooth muscle actin
- TAGLN - Transgelin
- MYH11 - Smooth muscle myosin heavy chain (shared with smooth muscle cells)

### Endothelial Cells

**Pan-endothelial markers:**
- PECAM1 (CD31) - Platelet endothelial cell adhesion molecule
- VWF - Von Willebrand factor
- CDH5 (VE-cadherin) - Vascular endothelial cadherin

**Arterial endothelial:**
- GJA5 - Connexin 40
- EFNB2 - Ephrin B2
- DLL4 - Delta-like 4

**Venous endothelial:**
- NR2F2 (COUP-TFII) - Nuclear receptor
- EPHB4 - Ephrin type-B receptor 4

**Lymphatic endothelial:**
- PROX1 - Prospero homeobox 1
- FLT4 (VEGFR3) - VEGF receptor 3
- LYVE1 - Lymphatic vessel endothelial receptor

**Capillary endothelial:**
- CA4 - Carbonic anhydrase 4
- RGCC - Regulator of cell cycle

### Pericytes

- RGS5 - Regulator of G-protein signaling
- PDGFRB - PDGF receptor beta
- CSPG4 (NG2) - Chondroitin sulfate proteoglycan
- ACTA2 (low expression) - Alpha smooth muscle actin

### Smooth Muscle Cells

- ACTA2 - Alpha smooth muscle actin (high)
- TAGLN - Transgelin
- MYH11 - Myosin heavy chain 11
- CNN1 - Calponin

---

## Stem and Progenitor Cells

### Hematopoietic Stem/Progenitor

**HSCs:**
- CD34 - Hematopoietic progenitor marker
- KIT (CD117) - Stem cell factor receptor
- THY1 (CD90) - HSC marker

**Common Myeloid Progenitors:**
- MPO - Myeloperoxidase (early)
- CSF1R - M-CSF receptor

**Common Lymphoid Progenitors:**
- IL7R - IL-7 receptor
- DNTT - Terminal deoxynucleotidyl transferase (TdT)

### Mesenchymal Stem Cells

- THY1 (CD90) - MSC marker
- ENG (CD105) - Endoglin
- NT5E (CD73) - 5' nucleotidase

### Neural Stem/Progenitor

- NES - Nestin
- SOX2 - Transcription factor
- FABP7 (BLBP) - Brain lipid binding protein
- HES1, HES5 - Notch targets

---

## Marker Interpretation Guidelines

### Best Practices

1. **Use Multiple Markers**
   - Single markers can be ambiguous
   - Confirm cell type with 3-5 markers
   - Check both positive and negative markers
2. **Consider Expression Level**
   - Some markers are low in certain states
   - Compare relative expression across clusters
   - Use dot plots to see % expressed + average expression
3. **Tissue Context Matters**
   - Markers can be tissue-specific
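   - The same CD marker can label different cell types depending on tissue (e.g., KIT/CD117 appears above for both mast cells and HSCs)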
   - Check literature for your specific tissue
4. **Be Aware of States**
   - Activated vs. resting
   - Mature vs. immature
   - Healthy vs. diseased
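
The two quantities a dot plot encodes (percent of cells expressing a gene, and average expression) can be computed directly, which makes the distinction in point 2 concrete. This is a sketch in base R on a toy matrix: the values, cluster labels, and the function name `dot_stats` are invented for illustration, and the calculation is a simplification rather than Seurat's exact `DotPlot` scaling.

```r
# Toy expression matrix: genes in rows, cells in columns (values invented).
expr <- matrix(
  c(5, 4, 6, 0, 0, 0,
    0, 0, 1, 3, 4, 5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("CD3D", "CD14"), paste0("cell", 1:6))
)
clusters <- factor(c("A", "A", "A", "B", "B", "B"))

# Hypothetical helper: per-cluster percent of cells with expression > 0
# and mean expression across all cells in the cluster.
dot_stats <- function(expr, clusters, gene) {
  values <- expr[gene, ]
  data.frame(
    cluster  = levels(clusters),
    pct_expr = as.vector(tapply(values > 0, clusters, mean)) * 100,
    avg_expr = as.vector(tapply(values, clusters, mean))
  )
}

cd14 <- dot_stats(expr, clusters, "CD14")
print(cd14)
```

Here a third of the cells in cluster A have nonzero CD14, but at a much lower average level than cluster B. Looking at both columns together helps separate a genuine marker from sparse background signal.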

### Visualization Tips

**Feature plots:**

```
library(Seurat)  # seurat_obj: a clustered Seurat object (assumed to exist)
FeaturePlot(seurat_obj, features = c("CD3D", "CD4", "CD8A"))
```

**Dot plots (best for many markers):**

```
DotPlot(seurat_obj, features = c("CD3D", "CD4", "CD8A", "CD14", "MS4A1"))
```

**Violin plots:**

```
VlnPlot(seurat_obj, features = c("CD3D", "CD8A"))
```

**Heatmap of top markers per cluster:**

```
# top_markers: character vector of genes, e.g. top hits per cluster from FindAllMarkers()
DoHeatmap(seurat_obj, features = top_markers)
```

### Annotation Workflow

1. **Run FindAllMarkers** to identify cluster-specific genes
2. **Compare to known markers** from this database
3. **Visualize candidates** with feature/dot plots
4. **Check expression patterns**:
   - Are markers co-expressed as expected?
   - Any unexpected combinations?
5. **Assign preliminary labels** based on strongest evidence
6. **Validate with literature** for your specific tissue/context
7. **Consider using automated tools** (SingleR, Azimuth) for confirmation
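
Steps 1, 2, and 5 can be sketched as a simple overlap count between each cluster's marker genes and the reference panels. The example below uses base R and a toy data frame that mimics only the `cluster` and `gene` columns of `FindAllMarkers()` output; the data, the `reference` list, and the helper `suggest_labels` are all hypothetical. Overlap counting is a deliberately crude stand-in for manual review, not an established scoring method.

```r
# Toy stand-in for FindAllMarkers() output (values invented). Only the
# `cluster` and `gene` columns are used here.
de_genes <- data.frame(
  cluster = c("0", "0", "0", "1", "1", "1", "2", "2"),
  gene    = c("CD3D", "CD3E", "IL7R", "CD79A", "MS4A1", "CD19", "GNLY", "NKG7"),
  stringsAsFactors = FALSE
)

# Reference panels taken from the PBMC section of this document.
reference <- list(
  "T cell"  = c("CD3D", "CD3E", "CD3G", "CD2"),
  "B cell"  = c("CD79A", "CD79B", "MS4A1", "CD19"),
  "NK cell" = c("GNLY", "NKG7", "GZMB", "NCAM1", "KLRB1")
)

# Hypothetical helper: for each cluster, count how many of its marker genes
# fall in each reference panel and report the best-matching cell type.
# Ties go to the first panel in `reference`; no overlap gives NA.
suggest_labels <- function(de_genes, reference) {
  by_cluster <- split(de_genes$gene, de_genes$cluster)
  rows <- lapply(names(by_cluster), function(cl) {
    overlap <- vapply(reference,
                      function(panel) sum(by_cluster[[cl]] %in% panel),
                      integer(1))
    best <- which.max(overlap)
    data.frame(
      cluster   = cl,
      label     = if (overlap[best] > 0) names(overlap)[best] else NA_character_,
      n_overlap = unname(overlap[best]),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

labels <- suggest_labels(de_genes, reference)
print(labels)
```

The resulting labels are preliminary. Clusters with a low overlap count or a near-tie between panels are the ones to check by hand in steps 3, 4, and 6.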

---

## Additional Resources

### Databases

- **CellMarker 2.0**: http://bio-bigdata.hrbmu.edu.cn/CellMarker/
- **PanglaoDB**: https://panglaodb.se/
- **CellTypist**: https://www.celltypist.org/
- **Human Cell Atlas**: https://www.humancellatlas.org/

### Automated Annotation Tools

- **SingleR** (Bioconductor): Reference-based annotation
- **Azimuth** (Seurat team): Pre-built references for common tissues
- **CellTypist**: ML-based annotation
- **scType**: Automated annotation from marker databases

---

**Last Updated:** January 2026

**Sources:** PanglaoDB, CellMarker, Human Cell Atlas, published literature

**Note:** Marker expression can vary by species, technology, and tissue context. Always validate in your specific experimental system.
