KEIA
Knowledge Extraction, Integration and Applications

GenMiner

Title: Genomic data Miner

Authors: Ricardo Martinez, Nicolas Pasquier, Claude Pasquier.

Description: Java implementation of the GenMiner algorithm for mining equivalence classes and minimal non-redundant association rules from gene expression data.

Implementation: Executable JAR file with a graphical user interface.

Details: This implementation integrates the R implementation of the Normalized Discretization method (NorDi) to preprocess gene expression data and the Java implementation of the JClose algorithm (JClose) to extract equivalence classes and minimal non-redundant association rules from these data.

Software

Reference

GenMiner: Mining informative association rules from genomic data, Ricardo Martinez, Nicolas Pasquier and Claude Pasquier, Bioinformatics, Oxford University Press, September, 2008.

Experimental results

These results were obtained from the annotation-enriched Eisen et al. dataset, which contains integrated gene expression measures for 2465 yeast genes in 737 columns (79 discretized gene expression levels and 658 gene annotations).

The gene expression measures were discretized by the NorDi algorithm with a 95% confidence level. The minimal confidence threshold for JClose was set to 50% and the minimal support threshold to 0.3%, so each extracted association rule is supported by at least 7 genes.
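
To make the two thresholds concrete, here is a minimal sketch of how the support and confidence of a rule can be computed over a set of genes. It is not GenMiner or JClose code: the function names and the toy item names are hypothetical, and each gene is modeled simply as a set of items (discretized expression levels and annotations).

```python
def support_count(transactions, itemset):
    """Number of transactions (genes) that contain every item of itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def rule_measures(transactions, antecedent, consequent):
    """Return (relative support, confidence) of the rule antecedent -> consequent."""
    both = support_count(transactions, set(antecedent) | set(consequent))
    ante = support_count(transactions, antecedent)
    support = both / len(transactions)
    confidence = both / ante if ante else 0.0
    return support, confidence


# Toy data: one set of items per gene (item names are made up for illustration).
genes = [
    {"heat_shock_up", "GO:stress_response"},
    {"heat_shock_up", "GO:stress_response"},
    {"heat_shock_up"},
    {"GO:stress_response"},
]

supp, conf = rule_measures(genes, {"heat_shock_up"}, {"GO:stress_response"})
# Two of the four genes contain both items, and three contain the antecedent,
# so supp is 2/4 and conf is 2/3.

# The relative threshold used above, expressed as a gene count on the 2465-gene dataset.
threshold_in_genes = 0.003 * 2465
```

On the 2465-gene dataset, 0.3% corresponds to about 7.4 genes, which the page reports as 7. How the tool rounds this value is not stated here.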

File: Equivalence classes
Description: Frequent closed itemsets and their generators, extracted by JClose with a minimal support threshold of 0.003. Each equivalence class is represented by a line of the form:
[Generator] [Closed itemset] n
where n is the support (number of genes) of the equivalence class.

File: Exact association rules
Description: Informative basis for exact association rules (confidence = 100%), displayed in the form:
[antecedent] => [consequent] supp=s conf=c
where s is the support and c is the confidence of the rule.

File: Approximate association rules
Description: Informative basis for approximate association rules, with a confidence greater than or equal to 0.5, displayed in the form:
[antecedent] -> [consequent] supp=s conf=c
where s is the support and c is the confidence of the rule.
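
The line formats above can be read with a few regular expressions. The sketch below is an assumption-laden illustration, not a parser shipped with GenMiner: the function names are hypothetical, items inside the brackets are assumed to be comma-separated, and supp and conf are assumed to be plain numbers. Check these assumptions against the actual result files before relying on it.

```python
import re

# "[antecedent] => [consequent] supp=s conf=c" (exact) or the same with "->" (approximate)
RULE_RE = re.compile(
    r"^\[(?P<ante>[^\]]*)\]\s*(?P<arrow>=>|->)\s*\[(?P<cons>[^\]]*)\]\s+"
    r"supp=(?P<supp>\S+)\s+conf=(?P<conf>\S+)$"
)
# "[Generator] [Closed itemset] n"
CLASS_RE = re.compile(r"^\[(?P<gen>[^\]]*)\]\s*\[(?P<closed>[^\]]*)\]\s+(?P<n>\d+)$")


def split_items(text):
    """Split a bracket's content into items (assumption: comma-separated)."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_rule(line):
    """Parse one rule line into a dict; raise ValueError if it does not match."""
    m = RULE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not a rule line: {line!r}")
    return {
        "antecedent": split_items(m["ante"]),
        "consequent": split_items(m["cons"]),
        "exact": m["arrow"] == "=>",
        "support": float(m["supp"]),
        "confidence": float(m["conf"]),
    }


def parse_equivalence_class(line):
    """Parse one equivalence-class line into a dict; raise ValueError otherwise."""
    m = CLASS_RE.match(line.strip())
    if m is None:
        raise ValueError(f"not an equivalence-class line: {line!r}")
    return {
        "generator": split_items(m["gen"]),
        "closed_itemset": split_items(m["closed"]),
        "support": int(m["n"]),
    }


# Made-up example lines that follow the documented shapes.
rule = parse_rule("[a, b] -> [c] supp=12 conf=0.75")
eq_class = parse_equivalence_class("[a] [a, b] 9")
```

Keeping the arrow type as a boolean lets exact and approximate rules from the two files be loaded into one collection and filtered later.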

GenMiner resources on the IDBC web site

Related publications

Mining association rule bases from integrated genomic data and annotations, Ricardo Martinez, Nicolas Pasquier and Claude Pasquier, Proceedings of the CIBB international conference on Computational Intelligence methods for Bioinformatics and Biostatistics, Salerno, Italy, 2008.

GenMiner: Mining informative association rules from genomic data, Ricardo Martinez, Claude Pasquier and Nicolas Pasquier, Proceedings of the IEEE BIBM international conference on Bioinformatics and Biomedicine, pages 15-22, IEEE Computer Society, 2007.

Knowledge integration models for mining gene expression data, Ricardo Martinez, PhD Thesis, Université de Nice Sophia Antipolis, 2007.