WormMine

WS295

Intermine data mining platform for C. elegans and related nematodes

Gene :

WormBase Gene ID  WBGene00004945
Gene Name  sop-2
Sequence Name  C50E10.4
Brief Description  sop-2 encodes a SAM domain-containing protein that is related to, but not orthologous with, Polycomb group proteins and ETS transcription factors; during development, SOP-2 activity is required for the proper integration of sexual, spatial, and temporal information during cell fate specification; specifically, SOP-2 is required for maintaining a restricted pattern of Hox gene expression, such as that of mab-5 and egl-5, to specific cells and tissues such as the serotonergic and dopaminergic male tail neurons and the ventral nerve cord; in addition, SOP-2 is required for the proper sexual and temporal specification of the hypodermal seam cells; in regulating temporal fate specification, genetic analyses indicate that sop-2 acts upstream of lin-29 in the heterochronic pathway and interacts with other members of the heterochronic pathway, such as lin-4, let-7, and hbl-1, to integrate temporal information and cell fate specification; sop-2 mutations also result in an increase in ALG-1-containing P bodies in hypodermal seam cells; in regulating neurotransmitter phenotype, sop-2 functions together with sor-3, which also encodes a Polycomb group protein, and members of the TGF-beta signaling pathway; sop-2 and sor-3 also function together to regulate progression through larval development; a SOP-2::GFP reporter fusion is expressed in the nuclei of all somatic cells beginning at the 50-cell stage of embryogenesis; SOP-2::GFP expression is initially diffuse, but by the 200-cell stage is visible in distinct nuclear bodies, the size and number of which may correlate with DNA content.
Organism  Caenorhabditis elegans
Automated Description  Enables RNA binding activity; protein homodimerization activity; and ubiquitin protein ligase binding activity. Involved in several processes, including negative regulation of cytoplasmic mRNA processing body assembly; neuron differentiation; and regulation of gene expression. Located in nuclear speck. Expressed widely.
Biotype  SO:0001217
Genetic Position  II: 8.55189 ± 0.18042
Length (nt)  4856

1 Organism

Name Taxon Id
Caenorhabditis elegans 6239

1 Synonyms

Value
WBGene00004945

Genomics

3 Transcripts

WormMine ID Sequence Name Length (nt) Chromosome Location
Transcript:C50E10.4a.1 C50E10.4a.1 2599   II: 12339944-12344799
Transcript:C50E10.4b.1 C50E10.4b.1 2601   II: 12339944-12344795
Transcript:C50E10.4c.1 C50E10.4c.1 1530   II: 12341954-12344410
 

Other

3 CDSs

WormMine ID Sequence Name Length (nt) Chromosome Location
CDS:C50E10.4a C50E10.4a 2208   II: 12339946-12340035
CDS:C50E10.4b C50E10.4b 2214   II: 12339946-12340035
CDS:C50E10.4c C50E10.4c 1530   II: 12341954-12342268

21 RNAi Result

WormBase ID
WBRNAi00067610
WBRNAi00067651
WBRNAi00068171
WBRNAi00042458
WBRNAi00042901
WBRNAi00070225
WBRNAi00011935
WBRNAi00012215
WBRNAi00076228
WBRNAi00030027
WBRNAi00086989
WBRNAi00086988
WBRNAi00086990
WBRNAi00089599
WBRNAi00089629
WBRNAi00086991
WBRNAi00087183
WBRNAi00070226
WBRNAi00070227
WBRNAi00086992
WBRNAi00117641

141 Allele

Public Name
gk963801
gk963053
gk962684
gk964116
WBVar01892050
WBVar00554104
WBVar01605488
WBVar02086400
WBVar02040699
WBVar00177373
WBVar01247193
WBVar02067097
WBVar01605490
WBVar01605491
WBVar01605492
WBVar01605484
WBVar01605485
WBVar01605487
WBVar01605489
h9094
h7573
WBVar01378801
WBVar01378791
WBVar01378792
WBVar01378793
WBVar01655441
WBVar01655440
otn1469
otn14478
otn14479

1 Chromosome

WormBase ID Organism Length (nt)
II Caenorhabditis elegans 15279421  

1 Chromosome Location


Feature . Primary Identifier Start End Strand
WBGene00004945 12339944 12344799 1
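The coordinates on this page appear to be 1-based and inclusive: the gene span 12339944-12344799 gives the listed length of 4856 nt only when both endpoints are counted. The following is a minimal sketch of that arithmetic, not part of WormMine itself; the helper names `parse_location` and `span_length` are hypothetical, and the "chromosome: start-end" string format is assumed from the tables on this page.

```python
from typing import Tuple


def parse_location(text: str) -> Tuple[str, int, int]:
    """Split a location string such as 'II: 12339944-12344799'.

    Hypothetical helper; the format is assumed from this page's tables.
    """
    chrom, _, coords = text.partition(":")
    start, _, end = coords.strip().partition("-")
    return chrom.strip(), int(start), int(end)


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive genomic interval."""
    return end - start + 1


if __name__ == "__main__":
    chrom, start, end = parse_location("II: 12339944-12344799")
    # Gene span listed above for WBGene00004945
    print(chrom, span_length(start, end))
    # Flanking intergenic regions listed elsewhere on this page
    print(span_length(12344800, 12344808))
    print(span_length(12338094, 12339943))
```

Note that the transcript and CDS lengths in the tables above are shorter than their genomic spans, which is consistent with them being spliced lengths rather than interval lengths.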

3 Data Sets

Name URL
WormBaseAcedbConverter  
GO Annotation data set  
C. elegans genomic annotations (GFF3 Gene)  

1 Downstream Intergenic Region

WormBase ID Name Sequence Name Length (nt) Chromosome Location Organism
intergenic_region_chrII_12344800..12344808   9 II: 12344800-12344808 Caenorhabditis elegans

133 Expression Clusters

Regulated By Treatment Description Algorithm Primary Identifier
  Transcripts that showed significantly increased expression in L1 neural cells compared with adult neural cells. DESeq2 (v1.18.1), fold change > 2, P-adj < 0.05, using Benjamini-Hochberg correction. WBPaper00060811:L1_vs_adult_upregulated_neural
  Transcripts expressed in neuronal cells, identified by analyzing fluorescence-activated cell sorted (FACS) neurons. DESeq. False discovery rate (FDR) < 0.1. WBPaper00048988:neuron_expressed
  mRNAs that showed decreased expression in the 1-cell embryo compared with the oocyte, according to RNAseq analysis. Gaussian error propagation. As cutoff for the up-regulated genes authors used log2 fold change > 1 and P < 0.05 and as cutoff for the down-regulated genes authors used log2 fold change < -1 and P < 0.05. WBPaper00045420:fertilization_downregulated_transcript
Osmotic stress Transcripts that showed significantly altered expression with 500 mM salt (NaCl) vs 100 mM salt when food was present. DESeq (version 1.10.1), FDR < 0.05. WBPaper00050726:OsmoticStress_regulated_Food
  Genes that showed increased expression in wdr-5(ok1417) compared with N2. Statistical analysis for misexpression was performed using a moderated t test from the package limma. All genes with a false discovery rate (FDR) of <= 5% (p <= 0.05) were selected as differentially regulated. WBPaper00045861:wdr-5(ok1417)_upregulated
  Transcripts that showed significantly higher expression in somatic gonad precursor cells (SGP) vs. head mesodermal cells (hmc). DESeq2, fold change >= 2, FDR <= 0.01. WBPaper00056826:SGP_biased
  Transcripts expressed in body muscle, according to PAT-Seq analysis using Pmyo-3-GFP-3XFLAG mRNA tagging. Cufflinks FPKM value >=1. WBPaper00050990:body-muscle_expressed
  Transcripts expressed in hypodermis, according to PAT-Seq analysis using Pdpy-7-GFP-3XFLAG mRNA tagging. Cufflinks FPKM value >=1. WBPaper00050990:hypodermis_expressed
  Transcripts expressed in intestine, according to PAT-Seq analysis using Pges-1-GFP-3XFLAG mRNA tagging. Cufflinks FPKM value >=1. WBPaper00050990:intestine_expressed
  Transcripts expressed in NMDA neuron, according to PAT-Seq analysis using Pnmr-1-GFP-3XFLAG mRNA tagging. Cufflinks FPKM value >=1. WBPaper00050990:NMDA-neuron_expressed
  Genes with expression level regulated by genotype (N2 vs CB4856) and age at the old adult stage (214 hours at 24°C). For model 2, authors used 100 permutations to estimate the FDR threshold. Per permutation, genotypes and ages were independently randomly distributed, keeping the among-gene structure intact. Then for each spot (23,232) on the array, model 2 was tested. The obtained P-values were used to estimate a threshold for each of the explanatory factors. Authors also used a genome-wide threshold of -log10 P-value = 2, which resembles an FDR of 0.072 and 0.060 for marker and the interaction age-marker for the developing worms and FDR of 0.050 and 0.065 for marker and age-marker for the aging worms. For the physiological age effect, authors used a -log10 P-value = 8 in developing worms (0.012 FDR) and -log10 P-value = 6 (0.032 FDR). WBPaper00040858:eQTL_age_regulated_aging
  Genes with expression level regulated by genotype (N2 vs CB4856) and age at the L3 larva and late reproduction stage (96 hours at 24°C). For model 2, authors used 100 permutations to estimate the FDR threshold. Per permutation, genotypes and ages were independently randomly distributed, keeping the among-gene structure intact. Then for each spot (23,232) on the array, model 2 was tested. The obtained P-values were used to estimate a threshold for each of the explanatory factors. Authors also used a genome-wide threshold of -log10 P-value = 2, which resembles an FDR of 0.072 and 0.060 for marker and the interaction age-marker for the developing worms and FDR of 0.050 and 0.065 for marker and age-marker for the aging worms. For the physiological age effect, authors used a -log10 P-value = 8 in developing worms (0.012 FDR) and -log10 P-value = 6 (0.032 FDR). WBPaper00040858:eQTL_age_regulated_developing
Bacteria diet: Escherichia coli HB101. Fed for 30 generations. Transcripts that showed significantly decreased expression after being fed E. coli HB101 for 30 generations, compared with animals fed E. coli OP50. DESeq2 fold change > 2, p-value < 0.01. WBPaper00061007:HB101_downregulated
Bacteria diet: Sphingomonas aquatilis Yellow. Fed for 30 generations. Transcripts that showed significantly decreased expression after being fed Sphingomonas aquatilis (Yellow) for 30 generations, compared with animals fed E. coli OP50. DESeq2 fold change > 2, p-value < 0.01. WBPaper00061007:S.aquatilis_downregulated
  Maternal class (M): genes that are called present in at least one of the three PC6 replicates. A modified Welch F statistic was used for ANOVA. For each gene, regressed error estimates were substituted for observed error estimates. The substitution is justified by the lack of consistency among the most and least variable genes at each time point. Regressed error estimates were abundance-dependent pooled error estimates that represented a median error estimate from a window of genes of similar abundance to the gene of interest. A randomization test was used to compute the probability Pg of the observed F statistic for gene g under the null hypothesis that developmental time had no effect on expression. P-values were not corrected for multiple testing. [cgc5767]:expression_class_M
  Transcripts that showed significantly increased expression in hrde-1(tm1200) animals, compared with N2, after growing at 25C for five generations (late generation). CuffDiff2 WBPaper00051265:F4_hrde-1(tm1200)_upregulated
  Transcripts that showed significantly increased expression in aak-1(tm1944);aak-2(ok524) animals compared with N2. DESeq 1.18.0, adjusted p-value < 0.05. WBPaper00056471:aak-1(tm1944);aak-2(ok524)_upregulated
Bacteria infection: Staphylococcus aureus MW2. 4 hours of exposure. Transcripts that showed significantly increased expression after N2 animals had 4 hours of infection by Staphylococcus aureus (MW2). DESeq 1.18.0, adjusted p-value < 0.05. WBPaper00056471:S.aureus-4h_upregulated_N2
  Transcripts that showed significantly changed expression in 6-day post-L4 adult hermaphrodite animals compared with 1-day post-L4 adult hermaphrodite animals. Sleuth WBPaper00051558:aging_regulated
  Transcripts that showed significantly altered expression after 24 hour exposure to stavudine (d4T) starting at L1 larva stage. DESeq WBPaper00053302:stavudine_24h_regulated
  Transcripts that showed significantly increased expression in sftb-1(cer6) deletion homozygous animals compared with N2 animals at L4 larva stage. DESeq2, fold change > 2 WBPaper00058725:sftb-1(cer6)_downregulated
  Genes that showed expression levels higher than the corresponding reference sample (L3/L4 all cell reference). A Mann-Whitney U test with an empirical background model and FDR correction for multiple testing was used to detect expressed transcripts (Benjamini and Hochberg 1995). Genes and TARs with an FDR <= 0.05 were reported as expressed above background. Authors detected differentially expressed transcripts using a method based on linear models. Genes and TARs were called differentially expressed if the FDR was <= 0.05 and the fold change (FC) >= 2.0. To more strictly correct for potential false-positives resulting from multiple sample comparisons, authors divided individual FDR estimates by the number of samples or sample comparisons, respectively. This resulted in an adjusted FDR of 1.3 * 0.0001 for expression above background and of 7.4 * 0.0001 for differential expression. Authors called genes selectively enriched in a given tissue if they met the following requirements: (1) enriched expression in a given tissue (FDR <= 0.05 and FC >= 2.0), (2) fold change versus reference among the upper 40% of the positive FC range observed for this gene across all tissues, and (3) fold-change entropy among the lower 40% of the distribution observed for all genes. WBPaper00037950:dopaminergic-neurons_L3-L4-larva_expressed
  Transcripts that showed significantly increased expression in mep-1(ne4629[MEP-1-GFP-Degron]) in gonads dissected from 1-day old adult animals. Salmon was used to map the mRNA-seq reads with the worm database WS268, and its output files were imported to DESeq2 in R. The differentially expressed genes were filtered by fold change more than 2 and adjusted p-value < 0.05. The scatter plots were generated by the plot function in R. WBPaper00061479:mep-1(ne4629)_upregulated
  Genes that showed expression levels higher than the corresponding reference sample (L2 all cell reference). A Mann-Whitney U test with an empirical background model and FDR correction for multiple testing was used to detect expressed transcripts (Benjamini and Hochberg 1995). Genes and TARs with an FDR <= 0.05 were reported as expressed above background. Authors detected differentially expressed transcripts using a method based on linear models. Genes and TARs were called differentially expressed if the FDR was <= 0.05 and the fold change (FC) >= 2.0. To more strictly correct for potential false-positives resulting from multiple sample comparisons, authors divided individual FDR estimates by the number of samples or sample comparisons, respectively. This resulted in an adjusted FDR of 1.3 * 0.0001 for expression above background and of 7.4 * 0.0001 for differential expression. Authors called genes selectively enriched in a given tissue if they met the following requirements: (1) enriched expression in a given tissue (FDR <= 0.05 and FC >= 2.0), (2) fold change versus reference among the upper 40% of the positive FC range observed for this gene across all tissues, and (3) fold-change entropy among the lower 40% of the distribution observed for all genes. WBPaper00037950:glr-1(+)-neurons_L2-larva_expressed
  Transcripts that showed altered expression in cat-1(RNAi) animals compared with control animals injected with empty vector. p-value <= 0.05 WBPaper00066902:cat-1(RNAi)_regulated
  Transcripts that showed significantly increased expression in hda-1(RNAi) embryos compared with control animals. DESeq2, fold change > 2, FDR < 0.05. WBPaper00067044:hda-1(RNAi)_upregulated
  Transcripts detected in germline isolated from day-1 adult hermaphrodite animals. All three experiments have CPM >= 1. WBPaper00067147:germline_expressed
  Transcripts that showed significantly decreased expression in pfd-6(gk493446); daf-2(e1370) compared with daf-2(e1370). Limma version 3.24.15. Fold change < 0.67 (p < 0.05). WBPaper00055827:pfd-6(gk493446)_downregulated
  Transcripts that showed significantly decreased expression in nhl-2(ok818) compared with N2 at 25C. EdgeR, FDR < 0.05, fold change < 0.5. WBPaper00055971:nhl-2(ok818)_25C_upregulated
  Transcripts that showed decreased expression in the hlh-11(ko1) knockout strain compared with the wild-type background. DESeq2, FDR < 0.05 WBPaper00060683:hlh-11(ko1)_downregulated
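Several of the clusters above are defined by a fold-change cutoff combined with a multiple-testing-adjusted p-value (for example "fold change > 2, P-adj < 0.05, using Benjamini-Hochberg correction"). The sketch below illustrates that kind of filter in plain Python. It is not the DESeq2 implementation and does not reproduce any listed study; the function names, the gene labels, and the toy values are all hypothetical.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_upregulated(
    results: Dict[str, Tuple[float, float]],
    min_fold: float = 2.0,
    max_padj: float = 0.05,
) -> List[str]:
    """Select genes with fold change > min_fold and adjusted p < max_padj.

    `results` maps a gene label to (fold_change, raw_p_value).
    """
    names = list(results)
    padj = benjamini_hochberg([results[n][1] for n in names])
    return [
        n for n, q in zip(names, padj)
        if results[n][0] > min_fold and q < max_padj
    ]


if __name__ == "__main__":
    # Toy values for illustration only, not data from any listed paper.
    toy = {
        "geneA": (3.1, 0.005),
        "geneB": (2.5, 0.04),
        "geneC": (1.2, 0.01),
        "geneD": (4.0, 0.03),
    }
    print(call_upregulated(toy))
```

The cutoffs are parameters because the clusters on this page use different thresholds (for example FDR < 0.1, FDR <= 0.01, or fold change >= 2).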

7 Expression Patterns

Remark Reporter Gene Primary Identifier Pattern Subcellular Localization
    Expr2034240 Single cell embryonic expression. Only cell types with an expression fraction greater than 0.2 of the maximum expressed fraction are labeled (Full data can be downloaded from http://caltech.wormbase.org/pub/wormbase/datasets-published/packer2019/). The colors represent the broad cell class to which the cell type has been assigned. The size of the point is proportional to the log2 of the numbers of cells in the dataset of that cell type. Interactive visualizations are available as a web app (https://cello.shinyapps.io/celegans/) and can also be installed as an R package (https://github.com/qinzhu/VisCello.celegans).  
    Expr1032460 Tiling arrays expression graphs  
    Expr2597 SOP-2::GFP localizes to the cell nuclei of essentially all somatic cells. bxEx99 and bxEx103 arrays produced essentially identical SOP-2::GFP expression patterns. SOP-2::GFP localizes to the cell nuclei of essentially all somatic cells and is apparent from the 50-cell stage embryo onward. At first, expression is weak and diffuse within nuclei, but, by the 200-cell stage, it becomes stronger, and distinct nuclear speckles, which are called "SOP-2 bodies," appear. By the comma stage (~400 cells), the nuclei of most somatic cells contain SOP-2 bodies, but their number and size vary among different cell types. Hypodermal and gut nuclei, which undergo endoreduplication, contain large numbers of SOP-2 nuclear bodies (hypodermis, 39.8+/-8.8, n=14; gut, >500, n=2) compared with other cell types (seam cells, 8.8+/-3.7, n=8; neurons, 8.4+/-2.7, n=15). No expression is detected in the nucleolus.
    Expr2016005 Single cell embryonic expression. Only cell types with an expression fraction greater than 0.2 of the maximum expressed fraction are labeled (Full data can be downloaded from http://caltech.wormbase.org/pub/wormbase/datasets-published/packer2019/). The colors represent the broad cell class to which the cell type has been assigned. The size of the point is proportional to the log2 of the numbers of cells in the dataset of that cell type. Interactive visualizations are available as a web app (https://cello.shinyapps.io/celegans/) and can also be installed as an R package (https://github.com/qinzhu/VisCello.celegans).  
    Expr1025868 Developmental gene expression time-course. Raw data can be downloaded from ftp://caltech.wormbase.org/pub/wormbase/datasets-published/levin2012  
    Expr1146873 Developmental gene expression time-course. Raw data can be downloaded from ftp://caltech.wormbase.org/pub/wormbase/datasets-published/hashimshony2015  
    Expr12241   Expression of SOP-2::GFP in the early adult was localized to nuclear bodies.

27 GO Annotation

Annotation Extension Qualifier
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
has_input(WB:WBGene00004945)|has_input(WB:WBGene00023405) involved_in
  involved_in
  located_in
  located_in
  located_in
  located_in
  involved_in
  involved_in
  enables
occurs_in(GO:0016604) enables
  enables
  enables
  enables
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
  enables
  enables
  enables

0 Homologues

1 Locations


Feature . Primary Identifier Start End Strand
WBGene00004945 12339944 12344799 1

27 Ontology Annotations

Annotation Extension Qualifier
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
has_input(WB:WBGene00004945)|has_input(WB:WBGene00023405) involved_in
  involved_in
  located_in
  located_in
  located_in
  located_in
  involved_in
  involved_in
  enables
occurs_in(GO:0016604) enables
  enables
  enables
  enables
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
  enables
  enables
  enables

0 Regulates Expr Cluster

1 Sequence

Length
4856

1 Sequence Ontology Term

Identifier Name Description
gene  

3 Strains

WormBase ID
WBStrain00036186
WBStrain00007188
WBStrain00008579

1 Upstream Intergenic Region

WormBase ID Name Sequence Name Length (nt) Chromosome Location Organism
intergenic_region_chrII_12338094..12339943   1850 II: 12338094-12339943 Caenorhabditis elegans