GenMiner
Title: Genomic data Miner
Author: Ricardo Martinez, Nicolas Pasquier, Claude Pasquier
Description: Java implementation of the GenMiner algorithm for mining equivalence classes and minimal non-redundant association rules from gene expression data.
Implementation: Executable JAR file with a graphical user interface.
Details: This implementation integrates the R implementation of the Normalized Discretization method (NorDi) to preprocess gene expression data and the Java implementation of the JClose algorithm (JClose) to extract equivalence classes and minimal non-redundant association rules from these data.
Software
Reference: GenMiner: Mining informative association rules from genomic data, Ricardo Martinez, Nicolas Pasquier and Claude Pasquier, Bioinformatics, Oxford University Press, September 2008.
Experimental results
These results were obtained from the annotation-enriched Eisen et al. dataset, which contains integrated gene expression measures for 2465 yeast genes and 737 columns (79 discretized gene expression levels and 658 gene annotations). The gene expression measures were discretized by the NorDi algorithm with a 95% confidence level. The minimal confidence threshold for JClose was set to 50% and the minimal support threshold to 0.3% (the extracted association rules correspond to at least 7 genes).
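To make the two thresholds concrete, the sketch below computes the support and confidence of a rule on a toy table, using the standard association-rule definitions: support is the number of rows (genes) that contain every item of the rule, and confidence is that count divided by the support of the antecedent. It is not taken from GenMiner or JClose. The class name `RuleMeasures`, the method names, the item labels and the use of `>=` when comparing against the thresholds are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: not the GenMiner or JClose code.
final class RuleMeasures {

    // Number of rows (genes) whose item set contains every item in `items`.
    static int supportCount(List<Set<String>> rows, Set<String> items) {
        int count = 0;
        for (Set<String> row : rows) {
            if (row.containsAll(items)) {
                count++;
            }
        }
        return count;
    }

    // confidence(A -> C) = support(A and C together) / support(A); 0 if A never occurs.
    static double confidence(List<Set<String>> rows, Set<String> antecedent, Set<String> consequent) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        int antecedentSupport = supportCount(rows, antecedent);
        return antecedentSupport == 0 ? 0.0 : (double) supportCount(rows, both) / antecedentSupport;
    }

    // True if the rule reaches both a relative support threshold and a confidence threshold.
    // Comparing with >= is an assumption, not something stated by the source.
    static boolean passes(List<Set<String>> rows, Set<String> antecedent, Set<String> consequent,
                          double minSupport, double minConfidence) {
        Set<String> both = new HashSet<>(antecedent);
        both.addAll(consequent);
        double relativeSupport = (double) supportCount(rows, both) / rows.size();
        return relativeSupport >= minSupport
                && confidence(rows, antecedent, consequent) >= minConfidence;
    }

    public static void main(String[] args) {
        // Four hypothetical genes, each described by a set of items.
        List<Set<String>> genes = List.of(
                Set.of("t1_over", "annotation_A"),
                Set.of("t1_over", "annotation_A"),
                Set.of("t1_over"),
                Set.of("annotation_A"));
        Set<String> antecedent = Set.of("t1_over");
        Set<String> consequent = Set.of("annotation_A");

        System.out.println("support(antecedent) = " + supportCount(genes, antecedent));
        System.out.println("confidence = " + confidence(genes, antecedent, consequent));
        System.out.println("passes 0.3% / 50% = " + passes(genes, antecedent, consequent, 0.003, 0.5));
    }
}
```

In this toy table the rule holds for two of the three genes that contain the antecedent, so its confidence is 2/3 and it clears a 50% confidence threshold.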
| File | Description |
| Equivalence classes | Frequent closed itemsets and their generators extracted by JClose with a minimal support threshold of 0.003. Each equivalence class is represented by a line of the form: [Generator] [Closed itemset] n where n is the support (number of genes) of the equivalence class. |
| Exact association rules | Informative basis for exact association rules (confidence = 100%), displayed in the form: [antecedent] => [consequent] supp=s conf=c where s is the support and c is the confidence of the rule. |
| Approximate association rules | Informative basis for approximate association rules, with a confidence greater than or equal to 0.5, displayed in the form: [antecedent] -> [consequent] supp=s conf=c where s is the support and c is the confidence of the rule. |
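The rule line format described in the table can be read back with a small parser. The following is a minimal sketch, not part of GenMiner: the class `RuleLineParser`, its `Rule` type and the sample item names are hypothetical, and it assumes that `supp` and `conf` are written as plain decimal numbers. Since the way items are separated inside the brackets is not specified here, the antecedent and consequent are kept as raw strings.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: not part of GenMiner.
final class RuleLineParser {

    // One parsed rule line; field names are chosen for this sketch.
    static final class Rule {
        final String antecedent;   // raw text between the first pair of brackets
        final String consequent;   // raw text between the second pair of brackets
        final boolean exact;       // true for "=>", false for "->"
        final double support;
        final double confidence;

        Rule(String antecedent, String consequent, boolean exact, double support, double confidence) {
            this.antecedent = antecedent;
            this.consequent = consequent;
            this.exact = exact;
            this.support = support;
            this.confidence = confidence;
        }
    }

    // Matches: [antecedent] => [consequent] supp=s conf=c   (or "->" for approximate rules)
    private static final Pattern LINE = Pattern.compile(
            "^\\s*\\[(.*?)\\]\\s*(=>|->)\\s*\\[(.*?)\\]\\s+supp=(\\S+)\\s+conf=(\\S+)\\s*$");

    // Returns the parsed rule, or empty if the line does not follow the expected form.
    static Optional<Rule> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        try {
            return Optional.of(new Rule(
                    m.group(1).trim(),
                    m.group(3).trim(),
                    m.group(2).equals("=>"),
                    Double.parseDouble(m.group(4)),
                    Double.parseDouble(m.group(5))));
        } catch (NumberFormatException e) {
            // supp or conf was not a plain decimal number (assumption violated).
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Hypothetical line; real files may use different item names.
        parse("[item_a item_b] => [item_c] supp=7 conf=1.0").ifPresent(r ->
                System.out.println(r.antecedent + " | " + r.consequent + " | exact=" + r.exact));
    }
}
```

Returning an empty `Optional` for lines that do not match lets a caller skip headers or blank lines without special handling.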
Related publications
Mining association rule bases from integrated genomic data and annotations, Ricardo Martinez, Nicolas Pasquier and Claude Pasquier, Proceedings of the CIBB international conference on Computational Intelligence methods for Bioinformatics and Biostatistics, Salerno, Italy, 2008.
GenMiner: Mining informative association rules from genomic data, Ricardo Martinez, Claude Pasquier and Nicolas Pasquier, Proceedings of the IEEE BIBM international conference on Bioinformatics and Biomedicine, pages 15-22, IEEE Computer Society, 2007.
Knowledge integration models for mining gene expression data, Ricardo Martinez, PhD Thesis, Université de Nice Sophia Antipolis, 2007.