WormMine

WS295

InterMine data mining platform for C. elegans and related nematodes

Gene :

WormBase Gene ID  WBGene00004781
Gene Name  set-1
Sequence Name  T26A5.7
Brief Description  set-1 encodes a SET domain-containing protein predicted to function as a histone lysine N-methyltransferase; set-1 is an essential gene that is ubiquitously expressed early in development.
Organism  Caenorhabditis elegans
Automated Description  Predicted to enable RNA binding activity and histone H4K20 methyltransferase activity. Involved in embryo development. Located in nucleus. Expressed in several structures, including P3.p hermaphrodite; P4.p hermaphrodite; P5.p hermaphrodite; P7.p hermaphrodite; and P8.p hermaphrodite. Is an ortholog of human KMT5A (lysine methyltransferase 5A).
Biotype  SO:0001217
Genetic Position  III: -1.25923 ± 0.002972
Length (nt)  1587

1 Organism

Name Taxon Id
Caenorhabditis elegans 6239

1 Synonyms

Value
WBGene00004781

Genomics

3 Transcripts

WormMine ID Sequence Name Length (nt) Chromosome Location
Transcript:T26A5.7a.1 T26A5.7a.1 1306   III: 6450792-6452372
Transcript:T26A5.7b.2 T26A5.7b.2 1133   III: 6450795-6452061
Transcript:T26A5.7b.1 T26A5.7b.1 1317   III: 6450795-6452378
 

Other

2 CDSs

WormMine ID Sequence Name Length (nt) Chromosome Location
CDS:T26A5.7b T26A5.7b 318   III: 6451353-6451490
CDS:T26A5.7a T26A5.7a 729   III: 6451353-6451490

17 RNAi Result

WormBase ID
WBRNAi00054169
WBRNAi00009222
WBRNAi00026404
WBRNAi00113443
WBRNAi00005063
WBRNAi00089618
WBRNAi00089588
WBRNAi00091486
WBRNAi00081701
WBRNAi00022786
WBRNAi00035982
WBRNAi00007886
WBRNAi00091984
WBRNAi00091985
WBRNAi00106488
WBRNAi00091987
WBRNAi00091992

25 Allele

Public Name
gk964518
gk176624
gk964032
gk964033
WBVar01952273
gk751343
gk839426
gk353746
gk327866
gk515995
gk835621
gk703289
gk752586
n4617
gk471745
gk530644
WBVar00062227
gk852457
gk534125
WBVar00062232
gk854238
tm1821
eh12
WBVar01446177
WBVar01446176

1 Chromosome

WormBase ID Organism Length (nt)
III Caenorhabditis elegans 13783801  

1 Chromosome Location


Feature . Primary Identifier Start End Strand
WBGene00004781 6450792 6452378 -1

4 Data Sets

Name URL
WormBaseAcedbConverter  
GO Annotation data set  
C. elegans genomic annotations (GFF3 Gene)  
Panther orthologue and paralogue predictions  

1 Downstream Intergenic Region

WormBase ID Name Sequence Name Length (nt) Chromosome Location Organism
intergenic_region_chrIII_6450692..6450791   100 III: 6450692-6450791 Caenorhabditis elegans

129 Expression Clusters

Regulated By Treatment Description Algorithm Primary Identifier
  Transcripts expressed in neuronal cells, by analyzing fluorescence-activated cell sorted (FACS) neurons. DESeq. False discovery rate (FDR) < 0.1. WBPaper00048988:neuron_expressed
  Genes that showed expression levels higher than the corresponding reference sample (embryonic 24hr reference). A Mann-Whitney U test with an empirical background model and FDR correction for multiple testing was used to detect expressed transcripts (Benjamini and Hochberg 1995). Genes and TARs with an FDR <= 0.05 were reported as expressed above background. Authors detected differentially expressed transcripts using a method based on linear models. Genes and TARs were called differentially expressed if the FDR was <= 0.05 and the fold change (FC) >= 2.0. To more strictly correct for potential false-positives resulting from multiple sample comparisons, authors divided individual FDR estimates by the number of samples or sample comparisons, respectively. This resulted in an adjusted FDR of 1.3 * 0.0001 for expression above background and of 7.4 * 0.0001 for differential expression. Authors called genes selectively enriched in a given tissue if they met the following requirements: (1) enriched expression in a given tissue (FDR <= 0.05 and FC >= 2.0), (2) fold change versus reference among the upper 40% of the positive FC range observed for this gene across all tissues, and (3) fold-change entropy among the lower 40% of the distribution observed for all genes. WBPaper00037950:all-neurons_L1-larva_expressed
adult vs dauer larva Transcripts that showed differential expression in adult vs dauer larva in N2 animals at 20C. N.A. WBPaper00050488:adult_vs_dauer_regulated_N2_20C
Bacteria infection: Enterococcus faecalis OG1RF. Exposure for 16 hours. Transcripts that showed significantly decreased expression in N2 after animals were exposed to E. faecalis OG1RF for 16 hours comparing to exposure to E. coli OP50. Cuffcompare and Cuffdiff WBPaper00056090:E.faecalis_downregulated_N2
  Transcripts expressed in GABAergic neuron, according to PAT-Seq analysis using Punc-47-GFP-3XFLAG mRNA tagging. Cufflinks FPKM value >=1. WBPaper00050990:GABAergic-neuron_expressed
  Transcripts expressed in intestine, according to PAT-Seq analysis using Pges-1-GFP-3XFLAG mRNA tagging. Cufflinks FPKM value >=1. WBPaper00050990:intestine_expressed
  Genes with expression level regulated by genotype (N2 vs CB4856) and age at L3 larva and Late reproduction stage (96 hours at 24 centigrade). For model 2, authors used 100 permutations to estimate the FDR threshold. Per permutation, genotypes and ages were independently randomly distributed, keeping the among-gene structure intact. Then for each spot (23,232) on the array, model 2 was tested. The obtained P-values were used to estimate a threshold for each of the explanatory factors. Authors also used a genome-wide threshold of -log10 P-value = 2, which resembles an FDR of 0.072 and 0.060 for marker and the interaction age-marker for the developing worms and FDR of 0.050 and 0.065 for marker and age-marker for the aging worms. For the physiological age effect, authors used a -log10 P-value = 8 in developing worms (0.012 FDR) and -log10 P-value = 6 (0.032 FDR). WBPaper00040858:eQTL_age_regulated_developing
  Transcripts that showed significantly increased expression in day 1 adult hermaphrodite comparing to in L4 larva fem-3(q20) animals. Fold change > 2, FDR < 0.05 WBPaper00064088:Day-1-adult_vs_L4_upregulated_fem-3(q20)
  Genes with expression level regulated by genotype (N2 vs CB4856) at L3 larva and Late reproduction stage (96 hours at 24 centigrade). For model 2, authors used 100 permutations to estimate the FDR threshold. Per permutation, genotypes and ages were independently randomly distributed, keeping the among-gene structure intact. Then for each spot (23,232) on the array, model 2 was tested. The obtained P-values were used to estimate a threshold for each of the explanatory factors. Authors also used a genome-wide threshold of -log10 P-value = 2, which resembles an FDR of 0.072 and 0.060 for marker and the interaction age-marker for the developing worms and FDR of 0.050 and 0.065 for marker and age-marker for the aging worms. For the physiological age effect, authors used a -log10 P-value = 8 in developing worms (0.012 FDR) and -log10 P-value = 6 (0.032 FDR). WBPaper00040858:eQTL_regulated_developing
  Transcripts that showed significantly increased expression in day 3 adult hermaphrodite comparing to in L4 larva fem-3(q20) animals. Fold change > 2, FDR < 0.05 WBPaper00064088:Day-3-adult_vs_L4_upregulated_fem-3(q20)
Bacteria diet: Escherichia coli HB101. Fed for 30 generations. Transcripts that showed significantly decreased expression after fed by bacteria E. coli HB101 for 30 generations comparing to animals fed by E. coli OP50. DESeq2 fold change > 2, p-value < 0.01. WBPaper00061007:HB101_downregulated
Bacteria diet: Sphingomonas aquatilis Yellow. Fed for 30 generations. Transcripts that showed significantly decreased expression after fed by bacteria Sphingomonas aquatilis (Yellow) for 30 generations comparing to animals fed by E. coli OP50. DESeq2 fold change > 2, p-value < 0.01. WBPaper00061007:S.aquatilis_downregulated
  Maternal class (M): genes that are called present in at least one of the three PC6 replicates. A modified Welch F statistic was used for ANOVA. For each gene, regressed error estimates were substituted for observed error estimates. The substitution is justified by the lack of consistency among the most and least variable genes at each time point. Regressed error estimates were abundance-dependent pooled error estimates that represented a median error estimate from a window of genes of similar abundance to the gene of interest. A randomization test was used to compute the probability Pg of the observed F statistic for gene g under the null hypothesis that developmental time had no effect on expression. P-values were not corrected for multiple testing. [cgc5767]:expression_class_M
  Significantly differentially expressed genes as determined by microarray analysis of wild-type and cde-1 mutant germlines. RNAs that changed at least 2-fold with a probability of p < 0.05 were considered differentially regulated between wildtype and cde-1. WBPaper00035269:cde-1_regulated
25C vs. 20C Transcripts that showed significantly increased expression in 1-day post L4 adult hermaphrodite N2 grown at 25C, comparing to in N2 animals grown at 20C. CuffDiff, fold change > 2. WBPaper00065096:25C_vs_20C_upregulated
  Transcripts that showed significantly increased expression in 10-days post L4 adult hermaphrodite N2 grown at 20C, comparing to in 1-day post L4 adult hermaphrodite N2 animals grown at 20C. CuffDiff, fold change > 2. WBPaper00065096:Day10_vs_Day1_upregulated
  Genes that showed expression levels higher than the corresponding reference sample (L2 all cell reference). A Mann-Whitney U test with an empirical background model and FDR correction for multiple testing was used to detect expressed transcripts (Benjamini and Hochberg 1995). Genes and TARs with an FDR <= 0.05 were reported as expressed above background. Authors detected differentially expressed transcripts using a method based on linear models. Genes and TARs were called differentially expressed if the FDR was <= 0.05 and the fold change (FC) >= 2.0. To more strictly correct for potential false-positives resulting from multiple sample comparisons, authors divided individual FDR estimates by the number of samples or sample comparisons, respectively. This resulted in an adjusted FDR of 1.3 * 0.0001 for expression above background and of 7.4 * 0.0001 for differential expression. Authors called genes selectively enriched in a given tissue if they met the following requirements: (1) enriched expression in a given tissue (FDR <= 0.05 and FC >= 2.0), (2) fold change versus reference among the upper 40% of the positive FC range observed for this gene across all tissues, and (3) fold-change entropy among the lower 40% of the distribution observed for all genes. WBPaper00037950:A-class-motor-neurons_L2-larva_expressed
  Transcripts that showed significantly decreased expression in tetraploid N2 comparing to diploid N2 animals at L4 larva stage. DESeq2 R package (1.20.0), fold change > 2, and FDR < 0.05. WBPaper00066110:tetraploid_vs_diploid_downregulated
  Genes that showed expression levels higher than the corresponding reference sample (L2 all cell reference). A Mann-Whitney U test with an empirical background model and FDR correction for multiple testing was used to detect expressed transcripts (Benjamini and Hochberg 1995). Genes and TARs with an FDR <= 0.05 were reported as expressed above background. Authors detected differentially expressed transcripts using a method based on linear models. Genes and TARs were called differentially expressed if the FDR was <= 0.05 and the fold change (FC) >= 2.0. To more strictly correct for potential false-positives resulting from multiple sample comparisons, authors divided individual FDR estimates by the number of samples or sample comparisons, respectively. This resulted in an adjusted FDR of 1.3 * 0.0001 for expression above background and of 7.4 * 0.0001 for differential expression. Authors called genes selectively enriched in a given tissue if they met the following requirements: (1) enriched expression in a given tissue (FDR <= 0.05 and FC >= 2.0), (2) fold change versus reference among the upper 40% of the positive FC range observed for this gene across all tissues, and (3) fold-change entropy among the lower 40% of the distribution observed for all genes. WBPaper00037950:coelomocytes_L2-larva_expressed
  Genes that showed expression levels higher than the corresponding reference sample (L3/L4 all cell reference). A Mann-Whitney U test with an empirical background model and FDR correction for multiple testing was used to detect expressed transcripts (Benjamini and Hochberg 1995). Genes and TARs with an FDR <= 0.05 were reported as expressed above background. Authors detected differentially expressed transcripts using a method based on linear models. Genes and TARs were called differentially expressed if the FDR was <= 0.05 and the fold change (FC) >= 2.0. To more strictly correct for potential false-positives resulting from multiple sample comparisons, authors divided individual FDR estimates by the number of samples or sample comparisons, respectively. This resulted in an adjusted FDR of 1.3 * 0.0001 for expression above background and of 7.4 * 0.0001 for differential expression. Authors called genes selectively enriched in a given tissue if they met the following requirements: (1) enriched expression in a given tissue (FDR <= 0.05 and FC >= 2.0), (2) fold change versus reference among the upper 40% of the positive FC range observed for this gene across all tissues, and (3) fold-change entropy among the lower 40% of the distribution observed for all genes. WBPaper00037950:dopaminergic-neurons_L3-L4-larva_expressed
  Genes that showed expression levels higher than the corresponding reference sample (L2 all cell reference). A Mann-Whitney U test with an empirical background model and FDR correction for multiple testing was used to detect expressed transcripts (Benjamini and Hochberg 1995). Genes and TARs with an FDR <= 0.05 were reported as expressed above background. Authors detected differentially expressed transcripts using a method based on linear models. Genes and TARs were called differentially expressed if the FDR was <= 0.05 and the fold change (FC) >= 2.0. To more strictly correct for potential false-positives resulting from multiple sample comparisons, authors divided individual FDR estimates by the number of samples or sample comparisons, respectively. This resulted in an adjusted FDR of 1.3 * 0.0001 for expression above background and of 7.4 * 0.0001 for differential expression. Authors called genes selectively enriched in a given tissue if they met the following requirements: (1) enriched expression in a given tissue (FDR <= 0.05 and FC >= 2.0), (2) fold change versus reference among the upper 40% of the positive FC range observed for this gene across all tissues, and (3) fold-change entropy among the lower 40% of the distribution observed for all genes. WBPaper00037950:excretory-cell_L2-larva_expressed
  Transcripts that showed significantly decreased expression in hpl-2(tm1489) comparing to in N2 animals. DESeq2, adjusted p-value < 0.05, log2 fold change > 2 or < -2. WBPaper00054493:hpl-2(tm1489)_downregulated
  Genes that showed expression levels higher than the corresponding reference sample (L3/L4 all cell reference). A Mann-Whitney U test with an empirical background model and FDR correction for multiple testing was used to detect expressed transcripts (Benjamini and Hochberg 1995). Genes and TARs with an FDR <= 0.05 were reported as expressed above background. Authors detected differentially expressed transcripts using a method based on linear models. Genes and TARs were called differentially expressed if the FDR was <= 0.05 and the fold change (FC) >= 2.0. To more strictly correct for potential false-positives resulting from multiple sample comparisons, authors divided individual FDR estimates by the number of samples or sample comparisons, respectively. This resulted in an adjusted FDR of 1.3 * 0.0001 for expression above background and of 7.4 * 0.0001 for differential expression. Authors called genes selectively enriched in a given tissue if they met the following requirements: (1) enriched expression in a given tissue (FDR <= 0.05 and FC >= 2.0), (2) fold change versus reference among the upper 40% of the positive FC range observed for this gene across all tissues, and (3) fold-change entropy among the lower 40% of the distribution observed for all genes. WBPaper00037950:PVD-OLL-neurons_L3-L4-larva_expressed
  Transcripts that showed significantly increased expression in ilc-17.1(syb5296) comparing to in N2 animals at L4 larva stage. DESeq2, fold change > 2, FDR < 0.05. WBPaper00066594:ilc-17.1(syb5296)_upregulated
  Transcripts detected in germline isolated from day-1 adult hermaphrodite animals. All three experiments have CPM >= 1. WBPaper00067147:germline_expressed
  Genes that were not enriched in either spermatogenic fem-3(q96gf) or oogenic fog-2(q71) gonads, according to RNAseq analysis. To identify differentially expressed transcripts, authors used R/Bioconductor package DESeq. WBPaper00045521:Gender_Neutral
  Transcripts that showed altered expression from P0 to F2 generation animals after the N2 parental generation was treated with antimycin, but not in damt-1(gk961032) P0 to F2 animals after the parental generation was treated with antimycin. N.A. WBPaper00055862:antimycin_damt-1(gk961032)_regulated
  Transcripts that showed significantly decreased expression in the neurons of bcat-1(RNAi) animals at 5-days post L4 adult hermaphrodite stage, comparing to animals injected with empty vector. DESeq2. FDR < 0.05. WBPaper00060459:bcat-1(RNAi)_downregulated
  Transcripts that showed decreased expression in hlh-11(ko1) knockout strain comparing to in wild type background. DESeq2, FDR < 0.05 WBPaper00060683:hlh-11(ko1)_downregulated
Bacteria infection: Enterococcus faecalis OG1RF. Exposure for 16 hours. Transcripts that showed significantly decreased expression in hpx-2(dg047) after animals were exposed to E. faecalis OG1RF for 16 hours comparing to exposure to E. coli OP50. Cuffcompare and Cuffdiff WBPaper00056090:E.faecalis_downregulated_hpx-2(dg047)

12 Expression Patterns

Remark Reporter Gene Primary Identifier Pattern Subcellular Localization
Strain: BC10574 [set-1::gfp] transcriptional fusion. PCR products were amplified using primer A: 5' [CTACGCTCATCAGGCAGTAGTTT] 3' and primer B 5' [CCTTCGAGTGACACCGCT] 3'. Expr6773 Adult Expression: Reproductive System; vulval muscle; seam cells; unidentified cells in head; unidentified cells in tail ; Larval Expression: seam cells; unidentified cells in head; unidentified cells in tail ;  
Also expressed in (comments from author) : unidentified cells in head, possibly labial sensilla. Strain: DM12747 [set-1::gfp] transcriptional fusion. PCR products were amplified using primer A: 5' [CTACGCTCATCAGGCAGTAGTTT] 3' and primer B 5' [CCTTCGAGTGACACCGCT] 3'. Expr6774 Adult Expression: pharynx; arcade cells; intestine; Reproductive System; vulval muscle; vulva other; hypodermis; seam cells; Nervous System; head neurons; labial sensilla; unidentified cells in head; Larval Expression: arcade cells; intestine; Reproductive System; developing vulva; hypodermis; seam cells; Nervous System; head neurons; labial sensilla; unidentified cells in head;  
    Expr1019100 Developmental gene expression time-course. Raw data can be downloaded from ftp://caltech.wormbase.org/pub/wormbase/datasets-published/levin2012  
Original chronogram file: chronogram.106.xml [T26A5.7:gfp] transcriptional fusion. Chronogram62    
    Expr1032370 Tiling arrays expression graphs  
    Expr14368    
    Expr1979 From late embryogenesis onwards, expression was observed in two rows of fourteen cells. The identity of these cells as the hypodermal seam cells was confirmed by immunohistochemistry using the MH27 antibody. Starting at the L3L4 stages, a dynamic pattern of expression was seen in the vulval precursor cells, and persisted into adulthood. Expression was also observed in at least six additional posterior cells that remain to be identified. No expression was observed in the gonad, presumably due to the previously described silencing of transgenes in the germline. In adult males, as well as expression in the seam cells, a specific pattern of expression was observed in the tail, in a number of unidentified cells. The protein is expressed almost ubiquitously in the embryos and the pattern becomes more and more specific during development. In all positive cells, fluorescence was observed in the nucleus.
    Expr1978 In situ hybridisation revealed a high level of expression in eggs, with ubiquitous expression in early embryos that declined and became more spatially restricted as development progressed. In L3L4 larvae, cells in the region of the developing vulva showed strong expression of set-1, and expression was also observed throughout the gonad. In adults, high expression was additionally seen in oocytes.  
    Expr2015764 Single cell embryonic expression. Only cell types with an expression fraction greater than 0.2 of the maximum expressed fraction are labeled (Full data can be downloaded from http://caltech.wormbase.org/pub/wormbase/datasets-published/packer2019/). The colors represent the broad cell class to which the cell type has been assigned. The size of the point is proportional to the log2 of the number of cells in the dataset of that cell type. Interactive visualizations are available as a web app (https://cello.shinyapps.io/celegans/) and can also be installed as an R package (https://github.com/qinzhu/VisCello.celegans).  
    Expr1157715 Developmental gene expression time-course. Raw data can be downloaded from ftp://caltech.wormbase.org/pub/wormbase/datasets-published/hashimshony2015  
    Expr1977 The gene showed a high level of expression in eggs and a decreased level of expression in successive stages of development.  
    Expr2033997 Single cell embryonic expression. Only cell types with an expression fraction greater than 0.2 of the maximum expressed fraction are labeled (Full data can be downloaded from http://caltech.wormbase.org/pub/wormbase/datasets-published/packer2019/). The colors represent the broad cell class to which the cell type has been assigned. The size of the point is proportional to the log2 of the number of cells in the dataset of that cell type. Interactive visualizations are available as a web app (https://cello.shinyapps.io/celegans/) and can also be installed as an R package (https://github.com/qinzhu/VisCello.celegans).  

20 GO Annotation

Annotation Extension Qualifier
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
  enables
  enables
  located_in
  located_in
  located_in
  located_in
  located_in
  located_in
  located_in
  involved_in
  part_of
  enables
  enables
  enables

9 Homologues

Type
least diverged orthologue
least diverged orthologue
orthologue
orthologue
orthologue
orthologue
least diverged orthologue
least diverged orthologue
least diverged orthologue

1 Locations


Feature . Primary Identifier Start End Strand
WBGene00004781 6450792 6452378 -1

20 Ontology Annotations

Annotation Extension Qualifier
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
  involved_in
  enables
  enables
  located_in
  located_in
  located_in
  located_in
  located_in
  located_in
  located_in
  involved_in
  part_of
  enables
  enables
  enables

0 Regulates Expr Cluster

1 Sequence

Length
1587

1 Sequence Ontology Term

Identifier Name Description
gene  

1 Strains

WormBase ID
WBStrain00006135

1 Upstream Intergenic Region

WormBase ID Name Sequence Name Length (nt) Chromosome Location Organism
intergenic_region_chrIII_6452379..6454482   2104 III: 6452379-6454482 Caenorhabditis elegans
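
Note on coordinates

The lengths reported in this record appear to follow from the listed start and end positions if the coordinates are read as 1-based and inclusive on both ends. That reading is an assumption here (it is the usual GFF3 convention), not something the record states. The Python sketch below is a minimal illustration that checks it against the gene and its two flanking intergenic regions; the Interval class and the abuts helper are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A feature on chromosome III; coordinates assumed 1-based, inclusive."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive interval: both end points count as bases.
        return self.end - self.start + 1


def abuts(left: Interval, right: Interval) -> bool:
    """True if `right` begins on the base immediately after `left` ends."""
    return left.end + 1 == right.start


# Values copied from the Chromosome Location and Intergenic Region tables.
gene = Interval("WBGene00004781", 6450792, 6452378)
downstream = Interval("intergenic_region_chrIII_6450692..6450791", 6450692, 6450791)
upstream = Interval("intergenic_region_chrIII_6452379..6454482", 6452379, 6454482)

for feature in (downstream, gene, upstream):
    print(f"{feature.name}\t{feature.start}-{feature.end}\t{feature.length} nt")

# The intergenic regions are named relative to the gene's strand (-1), so the
# "downstream" region lies at lower coordinates and the "upstream" one at higher.
print("downstream region abuts gene:", abuts(downstream, gene))
print("gene abuts upstream region:", abuts(gene, upstream))
```

Under this assumption the computed lengths can be compared directly with the Length (nt) values in the tables. Transcript and CDS lengths are not checked this way, since they presumably count spliced bases and not the genomic span.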