OneClick Mining

Software GUI

Patterns to sort, list of objects

Description of the GUI elements.

Summary of our project

Over the past few years, computer storage capacity has skyrocketed, and gathering huge quantities of information has become routine. Large data sets are now easy to obtain: regional temperatures around the world over the last ten years, purchases made by customers in shops, the results of a survey on any topic, and so on. However, these data are raw and unprocessed, often unlabeled or badly labeled, even though they contain a large amount of valuable information.

Schematic of the different steps of data mining

Data mining is a complex process. It starts by gathering the data and then processing them to keep only the most relevant parts: these are the selection step and the preprocessing step. The extracted data are then used as input for several algorithms during the mining step. The results are processed once more during the postprocessing step in order to keep only the relevant ones, structured as patterns. The mining step is the one that may cause problems: the most appropriate algorithms must be selected and their parameters chosen so that they perform well and give results that are interesting to the user.
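
The four steps described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy purchase records, the function names, and the choice of counting co-occurring item pairs as the "mining" algorithm are hypothetical and are not taken from the OneClick Mining software.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each record is one customer's purchase.
RAW_RECORDS = [
    {"customer": 1, "items": ["bread", "milk", ""]},
    {"customer": 2, "items": ["bread", "butter"]},
    {"customer": 3, "items": ["milk", "bread", "butter"]},
    {"customer": 4, "items": None},  # unusable record
]


def select(records):
    """Selection: keep only the records that actually contain items."""
    return [r["items"] for r in records if r["items"]]


def preprocess(transactions):
    """Preprocessing: drop empty labels and duplicates, normalise to sorted tuples."""
    return [tuple(sorted({i for i in t if i})) for t in transactions]


def mine(transactions):
    """Mining: count how many transactions contain each pair of items."""
    counts = Counter()
    for t in transactions:
        counts.update(combinations(t, 2))
    return counts


def postprocess(counts, min_support=2):
    """Postprocessing: keep only pairs seen in at least `min_support` transactions."""
    return {pair: n for pair, n in counts.items() if n >= min_support}


def run_pipeline(records, min_support=2):
    """Chain the four steps: selection, preprocessing, mining, postprocessing."""
    return postprocess(mine(preprocess(select(records))), min_support)


if __name__ == "__main__":
    print(run_pipeline(RAW_RECORDS))
```

In this sketch the `min_support` threshold plays the role of an algorithm parameter: choosing it badly either floods the user with patterns or hides all of them, which is the difficulty the mining step raises.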

This fourth-year student project is a data mining software tool designed for a user who has no experience in data mining. The user only needs to press a single button to get results. This concept is called OneClick Mining and is presented in the research article (1) One Click Mining - Interactive Local Pattern Discovery through Implicit Preference and Performance Learning.

Our Team

  • Laurence ROZE
    Supervisor of our project and member of the Lacodam research team at INRIA
  • Ibamar BA
    Fourth-year student in the IT department of INSA Rennes, BigData option
  • Francesco BARIATTI
    Fourth-year student in the IT department of INSA Rennes, Wide Scale Systems option
  • Pierre Nicolas EUDE
    Fourth-year student in the IT department of INSA Rennes, Media and Interactions option
  • Violaine FABRY
    Fourth-year student in the IT department of INSA Rennes, Wide Scale Systems option
  • Gregrory MARTIN
    Fourth-year student in the IT department of INSA Rennes, Wide Scale Systems option
  • Marie LOUP
    Fourth-year student in the IT department of INSA Rennes, Media and Interactions option
  • Louis-Marie RENAUD
    Fourth-year student in the IT department of INSA Rennes, BigData option

Learning cycle and Mining cycle

General operation

The drawing describes the general operation of the OneClick Mining software. It comprises the user-facing part, presented earlier, and the internal workings.
Each time the user clicks the Mining button, the utility function is updated from the list of patterns that the user judged interesting and the list of those they deleted. This function can be seen as a snapshot of the user's preferences. Applied to a pattern and its interestingness measures, it tells us whether, according to that snapshot, the software expects the user to find the pattern interesting. The interestingness measures are values that describe the associated pattern and evaluate its relevance according to several criteria, such as the number of attributes. Many measures exist, and which ones are used depends on the algorithm. Because the function is recomputed at each click on the Mining button, it always reflects the user's reaction to the patterns shown during the previous round.
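
As a rough illustration of the utility function described above, the sketch below scores a pattern as a weighted sum of its interestingness measures and nudges the weights towards kept patterns and away from deleted ones. This is an assumption chosen for simplicity, not the update rule actually used by the software; the names `utility` and `update_weights`, the measure names, and the learning rate are all hypothetical.

```python
def utility(weights, measures):
    """Score a pattern from its interestingness measures (higher = more promising)."""
    return sum(weights.get(name, 0.0) * value for name, value in measures.items())


def update_weights(weights, kept, deleted, learning_rate=0.1):
    """Return new weights moved towards kept patterns and away from deleted ones.

    `kept` and `deleted` are lists of measure dictionaries, one per pattern.
    This simple additive update is an assumption made for illustration only.
    """
    new_weights = dict(weights)
    for sign, group in ((1.0, kept), (-1.0, deleted)):
        for measures in group:
            for name, value in measures.items():
                new_weights[name] = new_weights.get(name, 0.0) + sign * learning_rate * value
    return new_weights


if __name__ == "__main__":
    # Feedback from one click: the user kept a short, frequent pattern
    # and deleted a long, rare one (hypothetical measures).
    kept = [{"frequency": 0.8, "n_attributes": 2.0}]
    deleted = [{"frequency": 0.1, "n_attributes": 5.0}]
    weights = update_weights({}, kept, deleted)

    short_frequent = {"frequency": 0.7, "n_attributes": 2.0}
    long_rare = {"frequency": 0.2, "n_attributes": 6.0}
    # The updated snapshot ranks the short, frequent candidate higher.
    print(utility(weights, short_frequent) > utility(weights, long_rare))
```

The weights dictionary is the "snapshot" of the user's preferences: it is rebuilt at each click from the latest feedback and then applied to new candidate patterns.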
A new learning cycle is then launched. During a learning cycle, numerous mining cycles run: several data mining algorithms are launched one after another, with only one algorithm per mining cycle. These algorithms produce patterns, which are shown to the user the next time they click the Mining button. In OneClick Mining, a pattern is a pair of values: the pattern itself and its interestingness measures.
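
The relationship between a learning cycle and its mining cycles can be sketched as follows. The `MinedPattern` class mirrors the pair described above (the pattern and its interestingness measures). The two stand-in algorithms and the round-robin scheduling are assumptions made for illustration; the text only states that one algorithm runs per mining cycle, not how it is chosen.

```python
from dataclasses import dataclass
from itertools import cycle, islice
from typing import Dict, Tuple


@dataclass
class MinedPattern:
    """A pattern as a pair: the pattern itself and its interestingness measures."""
    pattern: Tuple[str, ...]
    measures: Dict[str, float]


def frequent_items(transactions):
    """Stand-in algorithm: one single-item pattern per item, measured by frequency."""
    items = sorted({i for t in transactions for i in t})
    return [
        MinedPattern((i,), {"frequency": sum(i in t for t in transactions) / len(transactions)})
        for i in items
    ]


def whole_transactions(transactions):
    """Stand-in algorithm: each distinct transaction is a pattern, measured by frequency."""
    return [
        MinedPattern(t, {"frequency": transactions.count(t) / len(transactions)})
        for t in sorted(set(transactions))
    ]


def learning_cycle(algorithms, transactions, n_mining_cycles):
    """Run `n_mining_cycles` mining cycles, one algorithm per cycle, and collect the patterns."""
    produced = []
    # Round-robin order is an assumption; the real selection strategy is not described here.
    for algorithm in islice(cycle(algorithms), n_mining_cycles):
        produced.extend(algorithm(transactions))
    return produced


if __name__ == "__main__":
    data = [("bread", "milk"), ("bread",), ("bread", "milk")]
    for p in learning_cycle([frequent_items, whole_transactions], data, n_mining_cycles=2):
        print(p.pattern, p.measures)
```

The list returned by `learning_cycle` corresponds to the patterns that would be held until the user's next click on the Mining button.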

Bibliography

  • (1) Mario Boley, Michael Mampaey, Bo Kang, Pavel Tokmakov, Stefan Wrobel. One Click Mining - Interactive Local Pattern Discovery through Implicit Preference and Performance Learning. In Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics, IDEA ’13, pages 27–35, New York, NY, USA, 2013. ACM.
  • (2) Rabin Allesiardo, Raphaël Féraud. Un algorithme pour le problème des bandits manchots avec stationnarité par parties.
  • (3) Pannaga Shivaswamy, Karthik Raman, Thorsten Joachims. Online learning to diversify from implicit feedback. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’12, pages 705–713, New York, NY, USA, 2012. ACM.
  • (4) Nicolò Cesa-Bianchi, Gábor Lugosi. Prediction, Learning, and Games. Cambridge University Press, 2006.
  • (5) Francisco Herrera, Cristóbal José Carmona, Pedro González, María José del Jesus. An overview on subgroup discovery: foundations and applications. Knowl. Inf. Syst., 29(3):495–525, December 2011.
  • (6) Peggy Cellier. Non-supervised symbolic methods. University lecture at INSA Rennes, January 2016.
  • (7) P. Fournier-Viger, A. Gomariz, T. Gueniche, A. Soltani, C. Wu, V. S. Tseng. SPMF: a Java open-source pattern mining library. Journal of Machine Learning Research (JMLR), 15:3389–3393, 2014.